Preface


This volume is the record of an instructional conference on number theory 
and arithmetic geometry held from August 9 through 18, 1995 at Boston 
University. It contains expanded versions of all of the major lectures given 
during the conference. We want to thank all of the speakers, all of the 
writers whose contributions make up this volume, and all of the “behind- 
the-scenes” folks whose assistance was indispensable in running the con- 
ference. We would especially like to express our appreciation to Patricia 
Pacelli, who coordinated most of the details of the conference while in 
the midst of writing her PhD thesis, to Jaap Top and Jerry Tunnell, who 
stepped into the breach on short notice when two of the invited speakers 
were unavoidably unable to attend, and to Stephen Gelbart, whose courage
and enthusiasm in the face of adversity have been an inspiration to us.

Finally, the conference was only made possible through the generous 
support of Boston University, the Vaughn Foundation, the National Secu- 
rity Agency and the National Science Foundation. In particular, their gen- 
erosity allowed us to invite a multitude of young mathematicians, making 
the BU conference one of the largest and liveliest number theory confer- 
ences ever held. 


January 13, 1997 G. Cornell 
J.H. Silverman 
G. Stevens 
Introduction


The chapters of this book are expanded versions of the lectures given at 
the BU conference. They are intended to introduce the many ideas and 
techniques used by Wiles in his proof that every (semi-stable) elliptic curve 
over Q is modular, and to explain how Wiles’ result combined with Ribet’s 
theorem implies the validity of Fermat’s Last Theorem. 

The first chapter contains an overview of the complete proof, and it 
is followed by introductory chapters surveying the basic theory of elliptic 
curves (Chapter II), modular functions and curves (Chapter III), Galois 
cohomology (Chapter IV), and finite group schemes (Chapter V). Next we 
turn to the representation theory which lies at the core of Wiles’ proof. 
Chapter VI gives an introduction to automorphic representations and the 
Langlands-Tunnell theorem, which provides the crucial first step that a cer- 
tain mod 3 representation is modular. Chapter VII describes Serre’s conjec- 
tures and the known cases which give the link between modularity of elliptic 
curves and Fermat’s Last Theorem. After this come chapters on deforma- 
tions of Galois representations (Chapter VIII) and universal deformation 
rings (Chapter IX), followed by chapters on Hecke algebras (Chapter X) 
and complete intersections (Chapter XI). Chapters XII and XIV contain 
the heart of Wiles’ proof, with a brief interlude (Chapter XIII) devoted to 
representability of the flat deformation functor. The final step in Wiles’ 
proof, the so-called “3-5 shift,” is discussed in Chapters XV and XVI, and 
Diamond’s relaxation of the semi-stability condition is described in Chap- 
ter XVII. The volume concludes by looking both backward and forward in 
time, with two chapters (Chapters XVIII and XIX) describing some of the 
“pre-modular” history of Fermat’s Last Theorem, and two chapters (Chapters
XX and XXI) placing Wiles’ theorem into a more general Diophantine
context and giving some ideas of possible future applications. 

As the preceding brief summary will have made clear, the proof of 
Wiles’ theorem is extremely intricate and draws on tools from many areas of 
mathematics. The editors hope that this volume will help everyone, student 
and professional mathematician alike, who wants to study the details of 
what is surely one of the most memorable mathematical achievements of 
this century. 


AN OVERVIEW OF THE PROOF OF 
FERMAT’S LAST THEOREM 


GLENN STEVENS 


The principal aim of this article is to sketch the proof of the following 
famous assertion. 


Fermat’s Last Theorem. For $n > 2$, we have


$\mathrm{FLT}(n) :\quad a^n + b^n = c^n,\ \ a, b, c \in \mathbf{Z} \implies abc = 0.$


Many special cases of Fermat’s Last Theorem were proved from the 
17th through the 19th centuries. The first known case is due to Fermat 
himself, who proved FLT(4) around 1640. FLT(3) was proved by Euler 
between 1758 and 1770. Since FLT($d$) $\implies$ FLT($n$) whenever $d \mid n$, the
results of Euler and Fermat immediately reduce our theorem to the following
assertion. 


Theorem. If $p \ge 5$ is prime, and $a, b, c \in \mathbf{Z}$, then
$a^p + b^p + c^p = 0 \implies abc = 0$.
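

The reduction to prime exponents $p \ge 5$ can be spelled out as follows. This is only a sketch of the standard argument, filling in the step that the text calls immediate; the auxiliary integer $m$ is introduced here for illustration.

```latex
% Step 1: FLT(d) implies FLT(n) whenever d | n.
% Write n = d m. If a^n + b^n = c^n with abc \ne 0, then
\[
  (a^m)^d + (b^m)^d = (c^m)^d, \qquad a^m b^m c^m \ne 0,
\]
% which would contradict FLT(d).

% Step 2: every integer n > 2 is divisible by 4 or by an odd prime p.
% FLT(4) (Fermat) and FLT(3) (Euler) are known, so only the primes
% p \ge 5 remain.

% Step 3: for odd p we have (-c)^p = -c^p, so the equation
% a^p + b^p = c^p can be rewritten in the symmetric form
\[
  a^p + b^p + (-c)^p = 0,
\]
% and abc = 0 if and only if ab(-c) = 0.
```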


The proof of this theorem is the result of the combined efforts of innumer- 
able mathematicians who have worked over the last century (and more!) 
to develop a rich and powerful arithmetic theory of elliptic curves, modular 
forms, and galois representations. It seems appropriate to emphasize the 
names of five individuals who had the insight to see how this theory could 
be used to prove Fermat’s Last Theorem and to supply the final crucial 
ingredients of the proof: 


Gerhard Frey (1985), who first suggested that the existence of a solution
of the Fermat equation might contradict the Modularity Conjecture
of Taniyama, Shimura, and Weil; 


Jean-Pierre Serre (1985-6), who formulated and (with J.-F. Mestre)
tested numerically a precise conjecture about modular forms and galois
representations mod $p$ and who showed how a small piece of this
conjecture, the so-called epsilon conjecture, together with the Modularity
Conjecture would imply Fermat’s Last Theorem;


Ken Ribet (1986), who proved Serre’s epsilon conjecture, thus reducing 
the proof of Fermat’s Last Theorem to a proof of the Modularity Conjecture 
for semistable elliptic curves; 


Richard Taylor (1994), who collaborated with Wiles to complete the 
proof of Wiles’s numerical criterion in the minimal case; 


Andrew Wiles (1994), who had the vision to identify the crucial numerical
criterion from which the Modularity Conjecture for semistable elliptic
curves would follow, and who finally supplied a proof of this criterion, thus 
completing the proof of Fermat’s Last Theorem. 


To prove the theorem we follow the program outlined by Serre in [16].
Fix a prime $p \ge 5$ and suppose $a, b, c \in \mathbf{Z}$ satisfy
$a^p + b^p + c^p = 0$ but $abc \ne 0$. The triple $(a^p, b^p, c^p)$ is what
Gerhard Frey has called a “remarkable” triple of integers, so remarkable
in fact, that we suspect it does not exist. To derive a contradiction, we
will transform this triple into another object with remarkable properties,
namely a very special modular form $f_{a^p,b^p,c^p}$, something firmly
rooted in the fertile grounds of modern number theory. The construction of
this modular form is a two-step process. First, by a simple but insightful
construction due independently to Yves Hellegouarch and Gerhard Frey, we
obtain a certain semistable elliptic curve $E_{a^p,b^p,c^p}$ defined over
$\mathbf{Q}$. Then, by Wiles’s semistable modularity theorem, we deduce the
existence of a modular form $f_{a^p,b^p,c^p}$ associated to
$E_{a^p,b^p,c^p}$ by the correspondence of Eichler and Shimura.

With $f_{a^p,b^p,c^p}$ in hand, we seek a contradiction within the realm of
modular forms. The crucial ingredients that finally lead to a contradiction
are encoded in a certain irreducible galois representation
$\bar\rho_{a^p,b^p,c^p} : G_{\mathbf{Q}} \to \mathrm{GL}_2(\mathbf{F}_p)$
associated to $f_{a^p,b^p,c^p}$. As noted by Frey and Serre, the
remarkableness of the triple $(a^p, b^p, c^p)$ is reflected by some
remarkable local properties of $\bar\rho_{a^p,b^p,c^p}$. Indeed, they noted
that $\bar\rho_{a^p,b^p,c^p}$ can ramify only at 2 and $p$, and that the
ramification at $p$ is rather mild (what Serre called peu ramifiée). But
experience with galois representations shows that it is difficult to make
large galois representations with so little ramification. As Serre
conjectured and Ribet proved, the existence of such a modular galois
representation has untenable consequences in the theory of modular forms.
Fermat’s Last Theorem follows.


§1. A Remarkable Elliptic Curve 


In this section we describe the crucial construction of an elliptic curve
$E_{a^p,b^p,c^p}$ out of a hypothetical solution of the Fermat equation
$a^p + b^p + c^p = 0$. For any triple $(A, B, C)$ of coprime integers
satisfying $A + B + C = 0$, Gerhard Frey [8] considered the elliptic curve
$E_{A,B,C}$ defined by the Weierstrass equation


$E_{A,B,C} :\quad y^2 = x(x - A)(x + B)$


and explained some of the ways in which the arithmetic properties of
$E_{A,B,C}$ are related to the diophantine properties of the triple
$(A, B, C)$. Especially interesting are the connections with the
Masser-Oesterlé A-B-C conjecture and its generalizations. For a discussion
of this line of thought including connections with modular curves, we refer
the reader to [7] and to Frey’s article in this volume (chapter XX).

For our purposes it suffices to consider only the special case where
$(A, B, C) = (a^p, b^p, c^p)$ corresponds to a hypothetical solution of the
Fermat equation. Without loss of generality, we may assume $a \equiv -1$
modulo 4 and $2 \mid b$. It is not hard to calculate both the minimal
discriminant $\Delta_{a^p,b^p,c^p}$ and the conductor $N_{a^p,b^p,c^p}$ of
the elliptic curve $E_{a^p,b^p,c^p}$.


(1.1) Proposition. Let $p \ge 5$ be prime and let $a, b, c$ be coprime
integers satisfying $abc \ne 0$, $a \equiv -1$ modulo 4, $2 \mid b$, and
$a^p + b^p + c^p = 0$. Then $E_{a^p,b^p,c^p}$ is a semistable elliptic curve
whose minimal discriminant and conductor are given by the formulas

(a) $\Delta_{a^p,b^p,c^p} = 2^{-8} \cdot (abc)^{2p}$, and

(b) $N_{a^p,b^p,c^p} = \prod_{\ell \mid abc} \ell$.
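

The shape of these formulas can be explored numerically. The following is a minimal sketch, not taken from the text: the function names are hypothetical, and the sample triple $(A, B, C) = (3, -128, 125)$ is only an illustrative triple with $A + B + C = 0$, not a Fermat triple. It computes the discriminant of the Weierstrass model $y^2 = x(x - A)(x + B)$, using the standard fact that the model $y^2 = f(x)$ with $f$ a monic cubic has discriminant $16 \cdot \mathrm{disc}(f)$, and the product of the primes dividing $ABC$.

```python
def cubic_discriminant(r1, r2, r3):
    """Discriminant of the monic cubic (x - r1)(x - r2)(x - r3)."""
    return ((r1 - r2) * (r1 - r3) * (r2 - r3)) ** 2


def frey_model_discriminant(A, B):
    """Discriminant of the model y^2 = x(x - A)(x + B).

    The cubic has roots 0, A, -B, so with C = -(A + B) the result
    equals 16 * (A * B * C)**2.
    """
    return 16 * cubic_discriminant(0, A, -B)


def radical(n):
    """Product of the distinct primes dividing n (trial division)."""
    n = abs(n)
    rad, q = 1, 2
    while q * q <= n:
        if n % q == 0:
            rad *= q
            while n % q == 0:
                n //= q
        q += 1
    if n > 1:
        rad *= n
    return rad


if __name__ == "__main__":
    A, B = 3, -128          # illustrative values only
    C = -(A + B)            # so that A + B + C = 0
    print(frey_model_discriminant(A, B), radical(A * B * C))
```

For a Fermat triple, the model discriminant $16 (abc)^{2p}$ differs from the minimal discriminant of Proposition 1.1 by the factor $2^{12}$, while the conductor is the radical of $abc$.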


For definitions of semistability and of the conductor and minimal dis- 
criminant see Silverman’s article in this volume (chapter II, especially §14 
and §17). In general the primes dividing the minimal discriminant of an 
elliptic curve over Q are the same as those dividing the conductor and this 
might lead us to suspect that the discriminant and conductor should be 
close to one another. Indeed, Szpiro has formulated the following conjec- 
ture (see [19] where a slightly stronger form of the conjecture is formulated). 


Conjecture. (Szpiro) For any $\epsilon > 0$ there is a constant $C > 0$ such
that the minimal discriminant $\Delta_E$ and conductor $N_E$ of any elliptic
curve $E_{/\mathbf{Q}}$ satisfy the inequality

$|\Delta_E| < C \cdot N_E^{6+\epsilon}$.


On the other hand, proposition 1.1 shows that a counterexample to 
FLT(p) for sufficiently large p gives rise to an elliptic curve whose minimal 
discriminant and conductor are so far apart that they would contradict 
Szpiro’s conjecture. We might thus hope to uncover a contradiction within 
the field of diophantine geometry. We will follow a different but related 
path and examine certain galois representations attached to $E_{a^p,b^p,c^p}$.
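

The tension with Szpiro's conjecture can be made explicit. The following sketch assumes a counterexample $(a, b, c)$ as in Proposition 1.1, writes $\Delta$ and $N$ for the minimal discriminant and conductor of the associated curve, and uses the exponent $6 + \epsilon$ of Szpiro's conjecture; the choice $\epsilon = 1$ is made here only for concreteness.

```latex
% The conductor is the product of the primes dividing abc, hence
\[
  N = \prod_{\ell \mid abc} \ell \;\le\; |abc|
  \quad\Longrightarrow\quad
  |\Delta| = 2^{-8}\,|abc|^{2p} \;\ge\; 2^{-8} N^{2p}.
\]
% Szpiro's inequality with \epsilon = 1 and its constant C would give
\[
  2^{-8} N^{2p} \;\le\; |\Delta| \;<\; C\,N^{7}
  \quad\Longrightarrow\quad
  N^{2p-7} \;<\; 2^{8} C .
\]
% Since b is even we have N \ge 2, and 2p - 7 > 0 for p \ge 5, so
\[
  2^{2p-7} \;\le\; N^{2p-7} \;<\; 2^{8} C ,
\]
% which bounds p in terms of C. Thus Szpiro's conjecture would rule out
% counterexamples to FLT(p) for all sufficiently large p.
```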

The idea of using elliptic curves to study Fermat’s Last Theorem and 
vice versa goes back at least to the work of Y. Hellegouarch [9] (1972) who 
studied connections between the Fermat equation and torsion points on 
elliptic curves. Gerhard Frey seems to have been the first to suspect that a
counterexample to Fermat’s Last Theorem might contradict the Modularity 
Conjecture and to investigate various approaches based on this idea. 


§2. Galois Representations 


In this section we collect the basic definitions and conventions from the 
theory of galois representations that we will need later. For more details 
we refer the reader to the article by Mazur in this volume (chapter VIII). 

Let $\bar{\mathbf{Q}}$ be the algebraic closure of $\mathbf{Q}$ in
$\mathbf{C}$. We endow the galois group
$G_{\mathbf{Q}} := \mathrm{Gal}(\bar{\mathbf{Q}}/\mathbf{Q})$ with the Krull
topology in which a basis of neighborhoods of the origin is given by the
collection of subgroups $H \subseteq G_{\mathbf{Q}}$ of finite index in
$G_{\mathbf{Q}}$. With this topology, $G_{\mathbf{Q}}$ is a profinite group
and in particular is a compact topological group.

By a two dimensional galois representation over a topological ring $A$
we mean a continuous group homomorphism


$\rho : G_{\mathbf{Q}} \to \mathrm{GL}_2(A)$.


In this paper, the topological ring A will always be what Mazur calls a 
coefficient ring (in chapter VIII). Since these rings will play an important 
role in what follows, we make a formal definition. 


(2.1) Definition. A coefficient ring is a complete noetherian local ring 
with finite residue field of characteristic p (our fixed prime). 


Whenever we write that $\rho : G_{\mathbf{Q}} \to \mathrm{GL}_2(A)$ is a
galois representation, it is understood that $A$ is a coefficient ring and
that $\rho$ is continuous.


(2.2) Residual representations and deformations. Let $A$ be a coefficient
ring with maximal ideal $\mathfrak{m}_A$, and let $k_A := A/\mathfrak{m}_A$
be the residual field. We define the residual representation of a galois
representation $\rho : G_{\mathbf{Q}} \to \mathrm{GL}_2(A)$ to be the
representation


$\bar\rho : G_{\mathbf{Q}} \to \mathrm{GL}_2(k_A)$


obtained by composing $\rho$ with the reduction map
$\mathrm{GL}_2(A) \to \mathrm{GL}_2(k_A)$. Conversely, if
$\rho_0 : G_{\mathbf{Q}} \to \mathrm{GL}_2(k)$ is a two dimensional galois
representation over a finite field $k$, then we say that $\rho$ is a lifting
of $\rho_0$ to $A$ if $k = k_A$ and $\bar\rho = \rho_0$. Two liftings
$\rho, \rho'$ of $\rho_0$ to $A$ are said to be equivalent if $\rho'$ can be
conjugated to $\rho$ by a matrix in $\mathrm{GL}_2(A)$ that is congruent to
the identity matrix modulo $\mathfrak{m}_A$.

A deformation of $\rho_0$ to $A$ is an equivalence class of liftings of
$\rho_0$ to $A$. For a given lifting $\rho$ of $\rho_0$, we will abuse
notation and also write $\rho$ to denote the deformation to which it
belongs. This should not cause confusion in our discussion.


(2.3) The determinant of a galois representation. If $\rho$ is a two
dimensional galois representation over $A$ then


$\det(\rho) : G_{\mathbf{Q}} \to A^\times$

will denote the composition of $\rho$ with the determinant homomorphism
$\det : \mathrm{GL}_2(A) \to A^\times$.


In the applications it is sometimes convenient to restrict our attention to
representations with prescribed determinant.
For example, let $\chi_p : G_{\mathbf{Q}} \to \mathbf{Z}_p^\times$ denote
the cyclotomic character, which is characterized by the property
$\sigma(\zeta) = \zeta^{\chi_p(\sigma)}$ for any $p$-power root of unity
$\zeta$ and any $\sigma \in G_{\mathbf{Q}}$. Any coefficient ring $A$ admits
a unique continuous ring homomorphism $\mathbf{Z}_p \to A$ and we therefore
have a canonical group homomorphism $\mathbf{Z}_p^\times \to A^\times$. We
say that $\rho$ has determinant $\chi_p$ if $\det(\rho)$ is the composition
of $\chi_p$ with the canonical homomorphism
$\mathbf{Z}_p^\times \to A^\times$.


(2.4) Local galois groups. For each prime $\ell$, we let $\mathbf{Q}_\ell$
denote the field of $\ell$-adic rationals, i.e., the completion of
$\mathbf{Q}$ with respect to the $\ell$-adic absolute value $|\cdot|_\ell$.
We fix once and for all an algebraic closure $\bar{\mathbf{Q}}_\ell$ of
$\mathbf{Q}_\ell$ as well as an embedding of $\bar{\mathbf{Q}}$ into
$\bar{\mathbf{Q}}_\ell$. For $\ell = \infty$ we let
$\mathbf{Q}_\infty := \mathbf{R}$, the completion of $\mathbf{Q}$ with
respect to the usual absolute value $|\cdot|_\infty$, and we take
$\bar{\mathbf{Q}}_\infty := \mathbf{C}$. For each $\ell$ ($\ell$ prime, or
$\ell = \infty$), the local galois group at $\ell$ is the group

$G_{\mathbf{Q}_\ell} := \mathrm{Gal}(\bar{\mathbf{Q}}_\ell/\mathbf{Q}_\ell)$.


For $\ell = \infty$, we have

$G_{\mathbf{Q}_\infty} = \mathrm{Gal}(\mathbf{C}/\mathbf{R}) = \langle c \rangle$,


the cyclic group of order 2 generated by complex conjugation $c$. It is
well-known that for each $\ell$ there is a unique absolute value
$|\cdot|_\ell$ on $\bar{\mathbf{Q}}_\ell$ extending the given absolute value
on $\mathbf{Q}_\ell$. From this it follows easily that the elements of
$G_{\mathbf{Q}_\ell}$ are continuous automorphisms of
$\bar{\mathbf{Q}}_\ell$.

Using our fixed embeddings $\bar{\mathbf{Q}} \subseteq \bar{\mathbf{Q}}_\ell$,
we may restrict any automorphism of $\bar{\mathbf{Q}}_\ell$ to obtain an
automorphism of $\bar{\mathbf{Q}}$. Since $\bar{\mathbf{Q}}$ is dense in
$\bar{\mathbf{Q}}_\ell$, the induced homomorphisms
$G_{\mathbf{Q}_\ell} \to G_{\mathbf{Q}}$ are injective and we will regard
them as inclusions:

$G_{\mathbf{Q}_\ell} \subseteq G_{\mathbf{Q}}$.


These subgroups are often called the decomposition subgroups of
$G_{\mathbf{Q}}$. Of course, strictly speaking, they are not well-defined
since their definition depends on our choice of the fixed embeddings of
$\bar{\mathbf{Q}}$ into $\bar{\mathbf{Q}}_\ell$. However, changing any one
of these embeddings has the effect of conjugating the corresponding
decomposition subgroup by an element of $G_{\mathbf{Q}}$. This ambiguity
will not be important to us.


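(2.5) Inertia groups. For $\ell \ne \infty$, $G_{\mathbf{Q}_\ell}$ preserves
the ring $\bar{\mathbf{Z}}_\ell$ of integers in $\bar{\mathbf{Q}}_\ell$ and
also preserves the maximal ideal
$\mathfrak{m}_\ell \subseteq \bar{\mathbf{Z}}_\ell$. Thus,
$G_{\mathbf{Q}_\ell}$ acts naturally on the residual field
$\bar{\mathbf{F}}_\ell = \bar{\mathbf{Z}}_\ell/\mathfrak{m}_\ell$ and we
obtain a natural map
$G_{\mathbf{Q}_\ell} \to \mathrm{Gal}(\bar{\mathbf{F}}_\ell/\mathbf{F}_\ell)$,
which is easily seen to be surjective. Its kernel $I_\ell$ is called the
inertia group at $\ell$. Thus for each $\ell \ne \infty$, we have an exact
sequence

$1 \to I_\ell \to G_{\mathbf{Q}_\ell} \to \mathrm{Gal}(\bar{\mathbf{F}}_\ell/\mathbf{F}_\ell) \to 1$.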


(2.6) Local properties of galois representations. Given a global galois
representation $\rho : G_{\mathbf{Q}} \to \mathrm{GL}_2(A)$, we may restrict
$\rho$ to the decomposition groups $G_{\mathbf{Q}_\ell}$ and obtain the
family $\{\rho|_{G_{\mathbf{Q}_\ell}}\}$ of local galois representations

$\rho|_{G_{\mathbf{Q}_\ell}} : G_{\mathbf{Q}_\ell} \to \mathrm{GL}_2(A)$.


In many important examples from number theory one knows that the global
representation $\rho$ is determined up to isomorphism by the family of local
representations $\{\rho|_{G_{\mathbf{Q}_\ell}}\}_{\ell \notin S}$, where
$\ell$ ranges over the complement of any finite set $S$ of primes. By the
local properties at $\ell$ of a galois representation $\rho$ we mean the
properties of the local representation $\rho|_{G_{\mathbf{Q}_\ell}}$. The
next three definitions describe three local properties that play a special
role in what follows.


(2.7) Definition. We say that $\rho$ is odd if $\det \rho(c) = -1$, where
$c$ is the complex conjugation generating $G_{\mathbf{Q}_\infty}$.


(2.8) Definition. We say that $\rho$ is unramified at a prime $\ell$ if
$I_\ell \subseteq \ker \rho|_{G_{\mathbf{Q}_\ell}}$.


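Since the galois group $\mathrm{Gal}(\bar{\mathbf{F}}_\ell/\mathbf{F}_\ell)$
is a topologically cyclic group generated by the $\ell$th power Frobenius
automorphism $\mathrm{Frob}_\ell$, when $\rho$ is unramified at $\ell$,
$\rho|_{G_{\mathbf{Q}_\ell}}$ may be viewed as a homomorphism
$\mathrm{Gal}(\bar{\mathbf{F}}_\ell/\mathbf{F}_\ell) \to \mathrm{GL}_2(A)$
and is thus determined by its value on any representative of
$\mathrm{Frob}_\ell$ in $G_{\mathbf{Q}_\ell}$.

When $\ell = p$ we need the following weaker condition.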


(2.9) Definition. We say that $\rho$ is flat at $p$ if, for every ideal
$I \subseteq A$ for which $A/I$ is finite, the representation
$G_{\mathbf{Q}_p} \to \mathrm{GL}_2(A/I)$, obtained by reducing
$\rho|_{G_{\mathbf{Q}_p}}$ mod $I$, extends to a finite flat group scheme
over $\mathbf{Z}_p$ (see Tate’s article in this volume (chapter V)).


(2.10) Examples from number theory. The galois representations
that arise naturally in number theory have the especially nice property of
being unramified almost everywhere, that is, they are unramified at all but
finitely many primes $\ell$. For example, let $E_{/\mathbf{Q}}$ be an
elliptic curve. Then for each $n > 0$ the galois group $G_{\mathbf{Q}}$ acts
on the group $E[p^n] \cong (\mathbf{Z}/p^n\mathbf{Z})^2$ of $p^n$-torsion
points on $E$. Since the action of $G_{\mathbf{Q}}$ commutes with
multiplication by $p$ on $E$, $G_{\mathbf{Q}}$ acts naturally on the Tate
module


$\mathrm{Ta}_p(E) := \varprojlim E[p^n] \cong \mathbf{Z}_p^2$

and we obtain the $p$-adic galois representation

$\rho_{E,p} : G_{\mathbf{Q}} \to \mathrm{GL}_2(\mathbf{Z}_p)$


associated to $E$. The residual representation
$\bar\rho_{E,p} : G_{\mathbf{Q}} \to \mathrm{GL}_2(\mathbf{F}_p)$ describes
the action of $G_{\mathbf{Q}}$ on $E[p] \cong \mathbf{F}_p^2$. We have the
following basic result concerning the properties of these representations.


(2.11) Theorem. Let $\rho_{E,p}$ be the $p$-adic galois representation
associated to an elliptic curve $E_{/\mathbf{Q}}$ and let $N_E$ be the
conductor of $E$. Then
• the determinant of $\rho_{E,p}$ is $\chi_p$, and
• $\rho_{E,p}$ is unramified outside of $pN_E$.
In particular, $\rho_{E,p}$ is odd. If $E$ is semistable with minimal
discriminant $\Delta_E$, then the residual representation $\bar\rho_{E,p}$
has the following local properties.
• If $\ell \ne p$, then $\bar\rho_{E,p}$ is unramified at $\ell$ $\iff$ $p \mid \mathrm{ord}_\ell(\Delta_E)$.
• $\bar\rho_{E,p}$ is flat at $p$ $\iff$ $p \mid \mathrm{ord}_p(\Delta_E)$.


§3. A Remarkable Galois Representation 


Let $E := E_{a^p,b^p,c^p}$ be as in §1 and consider the galois representation

$\bar\rho_{a^p,b^p,c^p} : G_{\mathbf{Q}} \to \mathrm{GL}_2(\mathbf{F}_p)$


given by $\bar\rho_{a^p,b^p,c^p} = \bar\rho_{E,p}$. Gerhard Frey [7,8] and
Jean-Pierre Serre [16] noted that this representation has some remarkable
local properties. More precisely they proved the following theorem.


(3.1) Theorem. Let $p \ge 5$ be prime and $a, b, c \in \mathbf{Z}$ satisfy
$a^p + b^p + c^p = 0$ and $abc \ne 0$. Assume further that $a \equiv -1$
modulo 4 and $2 \mid b$. Then

(a) $\bar\rho_{a^p,b^p,c^p}$ is absolutely irreducible;

(b) $\bar\rho_{a^p,b^p,c^p}$ is odd;

(c) $\bar\rho_{a^p,b^p,c^p}$ is unramified outside $2p$ and is flat at $p$.


One suspects that there are no galois representations
$\rho_0 : G_{\mathbf{Q}} \to \mathrm{GL}_2(\mathbf{F}_p)$ satisfying
properties (a), (b) and (c), but this suspicion remains unproven.
On the other hand, by a theorem of Ribet, we do know that no such galois
representation lives in the world of modular forms, in a sense that we will
make precise in the next section.
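

Part (c) of Theorem 3.1 can be read off from Proposition 1.1 and Theorem 2.11. The following is a sketch of that valuation computation, writing $\Delta = 2^{-8}(abc)^{2p}$ for the minimal discriminant and using that $b$ is even while $a$ and $c$ are odd (a consequence of the coprimality assumption).

```latex
% Valuation of the minimal discriminant at a prime \ell:
\[
  \operatorname{ord}_\ell(\Delta) =
  \begin{cases}
    2p \cdot \operatorname{ord}_\ell(abc), & \ell \ne 2, \\[2pt]
    2p \cdot \operatorname{ord}_2(b) - 8,  & \ell = 2.
  \end{cases}
\]
% Case \ell odd, \ell \ne p: p divides ord_\ell(\Delta), so by
% Theorem 2.11 the residual representation is unramified at \ell.
%
% Case \ell = p: p is odd, so again p divides ord_p(\Delta), and
% Theorem 2.11 shows that the residual representation is flat at p.
%
% Case \ell = 2: ord_2(\Delta) \equiv -8 \pmod{p}, and p \ge 5 does
% not divide 8, so the criterion of Theorem 2.11 fails at 2. This is
% consistent with (c), which permits ramification at 2.
```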


§4. Modular Galois Representations 


The theory of modular forms offers a rich source of galois representations. 
Using the Hecke operators, these “modular” galois representations can be 
constructed out of the torsion groups on the modular jacobians $J_1(N)$,
$N > 0$, by the method of Eichler and Shimura. For an introduction to
the theory of modular forms and the Eichler-Shimura theory, see David 
Rohrlich’s article in this volume (chapter III). 


(4.1) Galois representations associated to newforms. Fix, once and
for all, a prime $\mathfrak{p}$ of $\bar{\mathbf{Q}}$ lying over $p$. Let
$f = \sum_{n \ge 1} a_n q^n$ be a weight two (normalized) newform of
conductor $N$ and character $\epsilon$ (in (3.5) of chapter III, newforms
are called primitive forms). We let $K_f$ be the completion at
$\mathfrak{p}$ of the number field generated by the values of $\epsilon$ and
the fourier coefficients $a_n$ ($n \ge 1$), and we let
$\mathcal{O}_f \subseteq K_f$ be the ring of integers in $K_f$. The theory
of Eichler and Shimura associates to $f$ an odd two dimensional galois
representation

$\rho_f : G_{\mathbf{Q}} \to \mathrm{GL}_2(\mathcal{O}_f)$

such that for all sufficiently large primes $\ell$, $\rho_f$ is unramified
at $\ell$ and

$\mathrm{Trace}(\rho_f(\mathrm{Frob}_\ell)) = a_\ell$ and $\det(\rho_f(\mathrm{Frob}_\ell)) = \epsilon(\ell)\ell$.


For the details of the Eichler-Shimura construction, we refer to section 3.7
of Rohrlich’s chapter III in this volume, where $\rho_f$ appears as
$\rho_\lambda$. By the work of Carayol and others, we now have a good
understanding of the local structure of $\rho_f$ at all primes. In
particular we know that $\rho_f$ is unramified outside $pN$ and that the
above conditions on the trace and determinant of
$\rho_f(\mathrm{Frob}_\ell)$ are satisfied for these primes.

By the work of Deligne [3] and Deligne-Serre [4], we know that similar
assertions hold for newforms of any weight $w \ge 1$. Indeed, if $f$ is a
weight $w$ newform of conductor $N$ then Deligne has constructed an odd two
dimensional $p$-adic galois representation $\rho_f$, which is unramified
outside $pN$ and satisfies
$\mathrm{Trace}(\rho_f(\mathrm{Frob}_\ell)) = a_\ell$ and
$\det(\rho_f(\mathrm{Frob}_\ell)) = \epsilon(\ell)\ell^{w-1}$ for all
$\ell \nmid pN$. In this paper, we will be concerned almost exclusively with
the case $w = 2$.


(4.2) Hecke algebras. Let $N > 0$ be an integer and let $S_2(N)$ denote
the space of weight 2 cusp forms for $\Gamma_1(N)$ (see (3.2) of chapter
III). We let

$\mathbf{T}'(N) := \mathbf{Z}[T_\ell, \langle d \rangle]$

be the $\mathbf{Z}$-subalgebra of $\mathrm{End}(S_2(N))$ generated by the
Hecke operators $T_\ell$ and the diamond operators $\langle d \rangle$ where
$\ell$ runs over all primes not dividing $pN$, and $d$ runs over
$(\mathbf{Z}/N\mathbf{Z})^\times$ (see (3.3) of chapter III).


(4.3) Modularity of galois representations. Motivated by (4.1) we
say that a galois representation


$\rho : G_{\mathbf{Q}} \to \mathrm{GL}_2(A)$


over a coefficient ring $A$ is modular if there exists an integer $N > 0$
and a homomorphism $\pi : \mathbf{T}'(N) \to A$ such that $\rho$ is
unramified outside $Np$ and for every prime $\ell \nmid pN$ we have


$\mathrm{Trace}(\rho(\mathrm{Frob}_\ell)) = \pi(T_\ell)$ and $\det(\rho(\mathrm{Frob}_\ell)) = \pi(\langle \ell \rangle)\ell$.


Remark: In view of the above restriction on the determinant it might 
be more appropriate to call these modular representations of weight 2. 
However, since all of our representations will have weight 2, we will drop 
that modifier from our language. 


(4.4) Serre’s Conjectures. In the special case where $A = k$ is a finite
field, Serre [16] has formulated some precise conjectures about modularity
of galois representations over $k$. One consequence of Serre’s conjectures
is the following conjecture.


Conjecture. Every odd absolutely irreducible galois representation

$\rho_0 : G_{\mathbf{Q}} \to \mathrm{GL}_2(k)$


is modular (in the sense of (4.3)).


In fact, Serre’s conjectures are much more precise. They predict, in terms
of the local structure of $\rho_0$, the optimal weight, conductor and
character of a newform $f$ for which $\bar\rho_f = \rho_0$. For precise
statements of Serre’s conjectures and an account of what is known about them
today, see the article by Edixhoven in this volume (chapter VII). An
important special case of these conjectures, which Serre called the epsilon
conjecture in [16], is the following theorem of Ribet [13] (see §3 of
chapter VII for a sketch of the proof).


(4.5) Ribet’s Theorem. Let $f$ be a weight two newform of conductor $N\ell$
where $\ell \nmid N$ is a prime. Suppose $\bar\rho_f$ is absolutely
irreducible and that one of the following is true:


• $\bar\rho_f$ is unramified at $\ell$; or
• $\ell = p$ and $\bar\rho_f$ is flat at $p$.


Then there is a weight two newform $g$ of conductor $N$ such that
$\bar\rho_f \cong \bar\rho_g$.


§5. The Modularity Conjecture and Wiles’s Theorem 


We say that an elliptic curve $E_{/\mathbf{Q}}$ is modular if there is a
weight two newform $f$ of conductor $N_E$ and trivial character for which


$L(f, s) = L(E, s)$.


There are a number of equivalent ways of defining modularity of elliptic 
curves. Here are a few. 


(5.1) Theorem. The following assertions are equivalent for an elliptic
curve $E_{/\mathbf{Q}}$.


(a) $E$ is modular;

(b) for some prime $p$, $\rho_{E,p}$ is modular;

(c) for every prime $p$, $\rho_{E,p}$ is modular;

(d) there is a non-constant morphism $\pi : X_0(N_E) \to E$ of algebraic
curves defined over $\mathbf{Q}$;

(e) $E$ is isogenous to the modular abelian variety $A_f$ associated to some
weight two newform $f$ of conductor $N_E$.
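

A small numerical check can make the equality of $L$-functions concrete. The sketch below is an illustration only and is not taken from the text. It assumes the classical example of the curve $y^2 + y = x^3 - x^2$ of conductor 11 and the weight two newform $q \prod_{n \ge 1} (1 - q^n)^2 (1 - q^{11n})^2$ of conductor 11, and compares, for a few primes $\ell \ne 11$, the quantity $\ell + 1 - \#E(\mathbf{F}_\ell)$ with the $\ell$-th Fourier coefficient. The function names are hypothetical.

```python
def count_points(p):
    """Number of points on y^2 + y = x^3 - x^2 over F_p, including infinity."""
    count = 1  # the point at infinity
    for x in range(p):
        rhs = (x ** 3 - x ** 2) % p
        for y in range(p):
            if (y * y + y - rhs) % p == 0:
                count += 1
    return count


def eta_product_coefficients(nmax):
    """Coefficients c[0..nmax] of q * prod_{n>=1} (1 - q^n)^2 (1 - q^(11n))^2."""
    series = [0] * nmax  # the infinite product, truncated below degree nmax
    series[0] = 1

    def multiply_by_one_minus_q_pow(s, k):
        # In-place multiplication by (1 - q^k); descending order keeps
        # s[i - k] at its old value when it is read.
        for i in range(len(s) - 1, k - 1, -1):
            s[i] -= s[i - k]

    for n in range(1, nmax):
        for _ in range(2):
            multiply_by_one_minus_q_pow(series, n)
            multiply_by_one_minus_q_pow(series, 11 * n)
    return [0] + series  # the leading factor q shifts every degree by one


if __name__ == "__main__":
    coeffs = eta_product_coefficients(20)
    for ell in (2, 3, 5, 7, 13, 17, 19):
        print(ell, ell + 1 - count_points(ell), coeffs[ell])
```

Agreement at finitely many primes is of course only evidence; the statement of modularity concerns all primes at once.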


We have the following profound conjecture developed between 1957 
and 1967 by Shimura, Taniyama, and Weil. 


(5.2) The Modularity Conjecture. Every elliptic curve over Q is mod- 
ular. 


The Modularity Conjecture is still open in general, but thanks to the work
of Wiles [20] and Taylor—Wiles [18], we know at least that it is true for a 
large and important class of elliptic curves, namely the semistable ones. 


(5.3) Wiles’s Theorem. Every semistable elliptic curve over Q is mod- 
ular. 


We will sketch the proof in §7. In fact, by improving Wiles’s methods, 
Fred Diamond [5] has proven the much stronger result that every elliptic 
curve $E_{/\mathbf{Q}}$ that is semistable at 3 and 5 is modular. The proof is outlined
in chapter XVII by Diamond. 


§6. The proof of Fermat’s Last Theorem


Returning to the situation of §1 and §3 we suppose $p \ge 5$ and assume
$a, b, c \in \mathbf{Z}$ satisfy $a^p + b^p + c^p = 0$ but $abc \ne 0$. We
derive a contradiction by the method described in [16] (see also [8]).
Without loss of generality, we may assume $a \equiv -1$ (mod 4) and
$2 \mid b$. Let $E_{a^p,b^p,c^p}$ be the elliptic curve
$y^2 = x(x - a^p)(x + b^p)$ and let $\rho_{a^p,b^p,c^p}$ be the associated
$p$-adic galois representation.

By proposition 1.1, $E_{a^p,b^p,c^p}$ is semistable and has conductor


$N_{a^p,b^p,c^p} = \prod_{\ell \mid abc} \ell$.


Hence, by Wiles’s theorem, $E_{a^p,b^p,c^p}$ is modular and there is a
weight two newform $f_{a^p,b^p,c^p}$ of conductor $N_{a^p,b^p,c^p}$
associated to $E_{a^p,b^p,c^p}$. In particular, we have
$\rho_{a^p,b^p,c^p} \cong \rho_{f_{a^p,b^p,c^p}}$. But according to theorem
3.1, $\bar\rho_{a^p,b^p,c^p}$ is absolutely irreducible and is unramified
outside $2p$ and flat at $p$. Applying Ribet’s Theorem we conclude that
there is a weight two newform $g$ of conductor 2 such that
$\bar\rho_g \cong \bar\rho_{a^p,b^p,c^p}$. But the dimension of
$S_2(\Gamma_0(2))$ is equal to the genus of $X_0(2)$, which is easily seen
to be zero. Thus there are no weight two newforms of conductor 2. This is a
contradiction and Fermat’s Last Theorem is proved.
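

The vanishing of the genus of $X_0(2)$ can be checked with the classical genus formula for $X_0(N)$, which is not stated in the text and is assumed here: $g = 1 + \mu/12 - \nu_2/4 - \nu_3/3 - \nu_\infty/2$, where $\mu$ is the index of $\Gamma_0(N)$, $\nu_2$ and $\nu_3$ count the solutions modulo $N$ of $x^2 + 1 \equiv 0$ and $x^2 + x + 1 \equiv 0$, and $\nu_\infty$ is the number of cusps. The following is a minimal sketch with hypothetical function names.

```python
from fractions import Fraction
from math import gcd


def prime_divisors(n):
    """Distinct prime divisors of n, by trial division."""
    primes, q = [], 2
    while q * q <= n:
        if n % q == 0:
            primes.append(q)
            while n % q == 0:
                n //= q
        q += 1
    if n > 1:
        primes.append(n)
    return primes


def euler_phi(n):
    """Euler's totient function, computed naively."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def genus_X0(N):
    """Genus of the modular curve X_0(N) via the classical formula."""
    mu = N
    for p in prime_divisors(N):
        mu = mu * (p + 1) // p  # index: N * prod over p | N of (1 + 1/p)
    nu2 = sum(1 for x in range(N) if (x * x + 1) % N == 0)
    nu3 = sum(1 for x in range(N) if (x * x + x + 1) % N == 0)
    cusps = sum(euler_phi(gcd(d, N // d))
                for d in range(1, N + 1) if N % d == 0)
    g = (1 + Fraction(mu, 12) - Fraction(nu2, 4)
         - Fraction(nu3, 3) - Fraction(cusps, 2))
    assert g.denominator == 1  # the formula always yields an integer
    return int(g)


if __name__ == "__main__":
    print(genus_X0(2), genus_X0(11))
```

For $N = 2$ the terms are $\mu = 3$, $\nu_2 = 1$, $\nu_3 = 0$, $\nu_\infty = 2$, giving $g = 1 + 1/4 - 1/4 - 0 - 1 = 0$, in agreement with the claim used in the proof.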


§7. The proof of Wiles’s Theorem 


In this final section, we describe the structure of the proof of Wiles’s
Theorem [18,20]. For other surveys of the proof, we recommend [2,12,14,17].
Here we assume that the distinguished prime $p$ is $\ge 3$. Let $k$ be a
finite field of characteristic $p$ and let


$\rho_0 : G_{\mathbf{Q}} \to \mathrm{GL}_2(k)$


be a galois representation. As we move through this section we will impose
a number of cumulative hypotheses on $\rho_0$. The first of these is the
following.


Hypothesis A. $\rho_0$ has determinant $\chi_p$.


(7.1) Semistable galois representations. We say that a galois
representation

$\rho : G_{\mathbf{Q}} \to \mathrm{GL}_2(A)$

is ordinary at $p$ if the restriction of $\rho$ to the inertia group $I_p$
at $p$ has the form
$\rho|_{I_p} \sim \begin{pmatrix} * & * \\ 0 & 1 \end{pmatrix}$ for a
suitable choice of basis. We say that $\rho$ is semistable at a prime $\ell$
if one of the following two conditions is satisfied.
• $\ell = p$ and $\rho$ is either flat at $p$ or ordinary at $p$ (or both).
• $\ell \ne p$ and $\rho|_{I_\ell} \sim \begin{pmatrix} 1 & * \\ 0 & 1 \end{pmatrix}$ for a suitable choice of basis.


We say that a two dimensional galois representation $\rho$ is semistable if
it is semistable at every prime. From now on, we impose the following
additional hypothesis on $\rho_0$.


Hypothesis B. $\rho_0$ is semistable.


The use of the word semistable in this context is motivated by the
simple fact that if $E_{/\mathbf{Q}}$ is a semistable elliptic curve, then
the $p$-adic galois representation
$\rho_{E,p} : G_{\mathbf{Q}} \to \mathrm{GL}_2(\mathbf{Z}_p)$ is semistable
in the above sense.


(7.2) Deformation types. A deformation type $\mathcal{D}$ is a list of
conditions to be imposed on deformations of a residual representation


$\rho_0 : G_{\mathbf{Q}} \to \mathrm{GL}_2(k)$.


Using more sophisticated terminology, a deformation type may be regarded
as a functor from the category of coefficient rings with residue field $k$
to the category of sets, where, for a given coefficient ring $A$,
$\mathcal{D}(A)$ is the set of deformations of $\rho_0$ to $A$ that satisfy
the conditions of $\mathcal{D}$. For more discussion of deformation types we
refer the reader to Mazur’s chapter VIII in this volume.

Wiles considers a variety of different deformation types, but for the
application to the semistable modularity conjecture it suffices to restrict
to the following special cases. Let
$S := \{\ell \ne p \mid \rho_0 \text{ is ramified at } \ell\}$. A
deformation type $\mathcal{D}$ is associated to a finite set of primes
$\Sigma_{\mathcal{D}}$ disjoint from $S$. We say that a deformation $\rho$
of $\rho_0$ is of type $\mathcal{D}$ if the following conditions are
satisfied.


• $\rho$ has determinant $\chi_p$,

• $\rho$ is unramified outside $S \cup \{p\} \cup \Sigma_{\mathcal{D}}$,

• $\rho$ is semistable outside $\Sigma_{\mathcal{D}}$, and

• if $p \notin \Sigma_{\mathcal{D}}$ and if $\rho_0$ is flat at $p$, then $\rho$ is also flat at $p$.


Roughly speaking, the last three conditions say that $\rho$ has the same
local properties as $\rho_0$ at primes not in $\Sigma_{\mathcal{D}}$. We
remark that in any case, if $\rho_0$ is ordinary at $p$ then $\rho$ is also
ordinary at $p$.


(7.3) Universal deformation rings and Hecke rings. In addition to
hypotheses A and B above we suppose $\rho_0$ satisfies the following
hypothesis.


Hypothesis C. $\rho_0$ is absolutely irreducible.


Using Mazur’s theory of deformations of galois representations [10], Wiles
associates to each deformation type $\mathcal{D}$ a universal deformation
ring $R_{\mathcal{D}}$ (which is, in particular, a coefficient ring) and a
universal deformation


$\rho_{\mathcal{D}} : G_{\mathbf{Q}} \to \mathrm{GL}_2(R_{\mathcal{D}})$


of $\rho_0$ of type $\mathcal{D}$. The representation $\rho_{\mathcal{D}}$
satisfies the following universal property: for every deformation
$\rho : G_{\mathbf{Q}} \to \mathrm{GL}_2(A)$ of $\rho_0$ of type
$\mathcal{D}$ there is a unique homomorphism
$\pi_A : R_{\mathcal{D}} \to A$ such that the diagram


$$\begin{array}{ccc}
G_{\mathbf{Q}} & \xrightarrow{\ \rho_{\mathcal{D}}\ } & \mathrm{GL}_2(R_{\mathcal{D}}) \\
 & {\scriptstyle \rho}\searrow & \downarrow{\scriptstyle \pi_A} \\
 & & \mathrm{GL}_2(A)
\end{array}$$


is commutative. For details on the properties and construction of
$R_{\mathcal{D}}$ see chapter VIII by Mazur and chapter XIII by Brian
Conrad. An explicit approach to constructing deformation rings is given in
chapter IX by de Smit and Lenstra.


Hypothesis D. $\rho_0$ is modular, and $\rho_0|_{\mathbf{Q}(\sqrt{-3})}$ is absolutely irreducible.


Under this hypothesis, Wiles defines another coefficient ring
$\mathbf{T}_{\mathcal{D}}$, the universal modular deformation ring, and a
universal modular deformation


$\rho_{\mathcal{D},\mathrm{mod}} : G_{\mathbf{Q}} \to \mathrm{GL}_2(\mathbf{T}_{\mathcal{D}})$


of $\rho_0$ of type $\mathcal{D}$. The representation
$\rho_{\mathcal{D},\mathrm{mod}}$ satisfies the analogous universal property
for modular deformations of type $\mathcal{D}$. Namely, for every modular
deformation $\rho : G_{\mathbf{Q}} \to \mathrm{GL}_2(A)$ of $\rho_0$ of type
$\mathcal{D}$ there is a unique homomorphism
$\pi_A : \mathbf{T}_{\mathcal{D}} \to A$ such that the obvious diagram
commutes.

The constructions of $\mathbf{T}_{\mathcal{D}}$ and
$\rho_{\mathcal{D},\mathrm{mod}}$ are quite difficult. The algebra
$\mathbf{T}_{\mathcal{D}}$ is defined in chapter XII by Diamond and Ribet.
Its existence depends on the highly non-trivial fact (described in chapter
VII by Edixhoven) that there exists a weight two newform $f$ such that
$\rho_f$ is a deformation of $\rho_0$ of type $\mathcal{D}$. The
representation $\rho_{\mathcal{D},\mathrm{mod}}$ is cut out of the Tate
module of a modular Jacobian using the Hecke operators. Wiles’s proof that
this representation is a free rank two $\mathbf{T}_{\mathcal{D}}$-module
depends on the Gorenstein property of $\mathbf{T}_{\mathcal{D}}$ (see
Tilouine’s chapter X in this volume). Later, other proofs of this fact were
given that do not make explicit use of the Gorenstein property, but rather
have the Gorenstein property as a by-product (for example, see [6]).


(7.4) The main theorem. By the universal property of $\rho_{\mathcal{D}}$
there is a unique homomorphism
$\varphi_{\mathcal{D}} : R_{\mathcal{D}} \to \mathbf{T}_{\mathcal{D}}$ such
that $\rho_{\mathcal{D},\mathrm{mod}} = \varphi_{\mathcal{D}} \circ \rho_{\mathcal{D}}$.
The following theorem is a special case of the main theorem of Wiles [20].


Theorem. Suppose $\rho_0$ satisfies hypotheses A-D. Then the canonical map
$\varphi_{\mathcal{D}} : R_{\mathcal{D}} \to \mathbf{T}_{\mathcal{D}}$ is an
isomorphism of complete intersection rings.


For the definition of complete intersection rings, we refer to chapter XI by
de Smit, Rubin, and Schoof in this volume. For our purposes what matters
is the conclusion that $\varphi_{\mathcal{D}}$ is an isomorphism. The proof
of the theorem is based on the numerical criterion of Wiles described in the
next section, which reduces the proof to an inequality between two numbers.
The theorem has the following important corollary as an immediate
consequence.


Corollary. Suppose $\rho_0$ satisfies hypotheses A-D. Then every deformation
of $\rho_0$ of type $\mathcal{D}$ is modular.


(7.5) Wiles's numerical criterion. Let $R$ and $T$ be coefficient rings and
suppose we have a commutative diagram

$$
\begin{array}{ccc}
R & \xrightarrow{\ \varphi\ } & T \\
{\scriptstyle\pi_R}\searrow & & \swarrow{\scriptstyle\pi_T} \\
 & \mathcal{O} &
\end{array}
$$

in which $\mathcal{O}$ is a complete discrete valuation ring and all the arrows
are surjective. Let $I_R := \ker \pi_R$, $I_T := \ker \pi_T$, and let
$\eta_T := \pi_T(\mathrm{Ann}_T(I_T))$. Then the following three assertions are
equivalent.

• $\varphi$ is an isomorphism of complete intersection rings;

• $I_R/I_R^2$ is finite and $\#(I_R/I_R^2) \le \#(\mathcal{O}/\eta_T)$;

• $I_R/I_R^2$ is finite and $\#(I_R/I_R^2) = \#(\mathcal{O}/\eta_T)$.


This is a special case of Criterion I given in chapter XI by de Smit, Rubin,
and Schoof.


(7.6) Selmer groups and congruence modules. Now let $f$ be a weight
two newform and suppose $\rho_f : G_{\mathbb{Q}} \to GL_2(\mathcal{O}_f)$ is a
deformation of $\rho_0$ of type D. By the universality of $T_D$ there is a
unique homomorphism $\pi_{T_D} : T_D \to \mathcal{O}_f$ such that
$\rho_f = \pi_{T_D} \circ \rho_{D,\mathrm{mod}}$. Let
$\pi_{R_D} = \pi_{T_D} \circ \varphi_D$, so that we have the following
commutative diagram:

$$
\begin{array}{ccc}
R_D & \xrightarrow{\ \varphi_D\ } & T_D \\
{\scriptstyle\pi_{R_D}}\searrow & & \swarrow{\scriptstyle\pi_{T_D}} \\
 & \mathcal{O}_f. &
\end{array}
$$

To prove that $\varphi_D$ is an isomorphism, Wiles establishes the middle
inequality in the above criterion. For this, he first interprets the two sides
of the inequality in terms of other objects that have been studied in some
detail in the literature. More precisely, Wiles interprets the "tangent space"
$\mathrm{Hom}_{\mathcal{O}}(I_{R_D}/I_{R_D}^2, K/\mathcal{O})$ as a Selmer group
$H^1_D(G_{\mathbb{Q}}, \mathrm{ad}^0(\rho_f) \otimes K/\mathcal{O})$, i.e., as
a certain subgroup of the galois cohomology group
$H^1(G_{\mathbb{Q}}, \mathrm{ad}^0(\rho_f) \otimes K/\mathcal{O})$
determined by local conditions associated to D, and he interprets
$\mathcal{O}/\eta_{T_D}$ as a congruence module classifying congruences between
$f$ and other newforms of type D. For precise definitions, see sections 4.2 and
4.3 of chapter XII by Diamond and Ribet, chapter VIII by Mazur, and chapter IV
by Washington. The isomorphism between tangent spaces and Selmer groups is
described in chapter VIII.

The proof of the crucial numerical inequality divides into two parts. 
The case where $\Sigma_D = \emptyset$, which is called the minimal case, is
proved by Wiles with Taylor in [18]. Their original proof has been simplified by
making use of another criterion due to Faltings, a generalization of which
is given as criterion II in chapter XI. This is the method followed by de
Shalit in chapter XIV. The non-minimal case is proved by induction on the
number of primes in $\Sigma_D$. The proof is accomplished by analyzing how the
Selmer groups and congruence modules grow as $\Sigma_D$ is enlarged to conclude
that if the numerical inequality is satisfied for one D then it is also satisfied
when more primes are included in $\Sigma_D$. See chapter XII by Diamond and
Ribet for more details. 


(7.7) The Proof of Wiles's Theorem. We prepare for the proof by
noting that hypotheses A and B are satisfied by $\bar\rho_{E,p}$ for every
prime $p$. Indeed hypothesis A is contained in theorem 2.11 and hypothesis B is
a consequence of the semistability of $E$.

Moreover, by a theorem of Serre ([15], prop. 21, and [17], §3.1), the
semistability of $E$ guarantees that $\bar\rho_{E,p}$ is either surjective or
reducible for every prime $p \ge 3$. Hence for $p \ge 3$, absolute
irreducibility of $\bar\rho_{E,p}$ is equivalent to irreducibility of
$\bar\rho_{E,p}$, and if $p = 3$ this is equivalent to absolute irreducibility
of $\bar\rho_{E,3}|_{G_{\mathbb{Q}(\sqrt{-3})}}$. Thus the following lemma is a
consequence of corollary 7.4.


(7.8) Lemma. Let $E_{/\mathbb{Q}}$ be a semistable elliptic curve and suppose
$\bar\rho_{E,p}$ is both modular and irreducible for some prime $p \ge 3$.
Then $E$ is modular.


Wiles gave an ingenious argument to show that for $E$ semistable, the
hypotheses of the lemma are satisfied by either $p = 3$ or $p = 5$. The proof
is based on the following three theorems.


(7.9) Theorem. Let $E$ be an arbitrary elliptic curve and suppose
$\bar\rho_{E,3}$ is irreducible. Then $\bar\rho_{E,3}$ is modular.


This follows from a deep theorem of Langlands and Tunnell and de- 
pends in a crucial way on the theory of Langlands for GL2. For an exposi- 
tion of the Langlands theory and the proof of Theorem 7.9, see chapter VI 
by Stephen Gelbart in this volume. 

(7.10) Theorem. Let $E_{/\mathbb{Q}}$ be a semistable elliptic curve and
suppose $\bar\rho_{E,5}$ is irreducible. Then there is another semistable
elliptic curve $E'_{/\mathbb{Q}}$ for which
(a) $\bar\rho_{E',3}$ is irreducible, and

(b) $\bar\rho_{E',5} \cong \bar\rho_{E,5}$.

Indeed, proposition 11 and the argument in section 4 of Rubin's chapter XVI
in this volume provide us with a family of elliptic curves $E'_{/\mathbb{Q}}$
satisfying conditions (a) and (b). All of these curves are semistable away
from 5. By taking $E'$ in this family sufficiently close 5-adically to $E$, we
obtain the desired semistable curve.


(7.11) Theorem. Let $E_{/\mathbb{Q}}$ be a semistable elliptic curve. Then at
least one of the representations $\bar\rho_{E,3}$ or $\bar\rho_{E,5}$ is
irreducible.


Indeed, if both $\bar\rho_{E,3}$ and $\bar\rho_{E,5}$ were reducible, then
$E[15]$ would contain a galois invariant subgroup of order 15. This contradicts
Lemma 9 (iv) of chapter XVI by Karl Rubin (see also [11]).


(7.12) Conclusion of the proof. Let $E_{/\mathbb{Q}}$ be a semistable elliptic
curve. If $\bar\rho_{E,3}$ is irreducible then, according to theorem 7.9,
$\bar\rho_{E,3}$ is also modular, so $E$ is modular by lemma 7.8. If
$\bar\rho_{E,3}$ is not irreducible, then by theorem 7.11, $\bar\rho_{E,5}$ is
irreducible. Then there is another semistable elliptic curve
$E'_{/\mathbb{Q}}$ satisfying (a) and (b) of theorem 7.10. In particular,
$\bar\rho_{E',3}$ is irreducible. Repeating the above argument we see that
$E'$ is modular. Hence $\bar\rho_{E',5}$ is modular and by (b) of 7.10,
$\bar\rho_{E,5}$ is modular. Once again we use lemma 7.8 to conclude $E$ is
modular.


A SURVEY OF THE ARITHMETIC
THEORY OF ELLIPTIC CURVES


JOSEPH H. SILVERMAN 


§1. BASIC DEFINITIONS 


An elliptic curve is a pair $(E, O)$, where $E$ is a smooth projective curve
of genus one and $O$ is a point of $E$. The elliptic curve is said to be
defined over the field $K$ if the underlying curve is defined over $K$ and the
point $O$ is defined over $K$.


Every elliptic curve can be embedded as a smooth cubic curve in $\mathbb{P}^2$
given by an equation of the form

(1)   $E : y^2 + a_1xy + a_3y = x^3 + a_2x^2 + a_4x + a_6.$

Such an equation is called a Weierstrass equation for $E$. The point $O$ is the
point $[0, 1, 0]$ at infinity. If $E$ is defined over $K$, then the $a_i$'s can
be chosen in $K$. If in addition $\mathrm{char}(K) \ne 2, 3$, then $E$ has a
Weierstrass equation of the form

(2)   $E : y^2 = x^3 + Ax + B.$

The non-singularity assumption on $E$ implies that the discriminant

$$\Delta = -16(4A^3 + 27B^2) \ne 0.$$

We also define the j-invariant of $E$ to be the quantity

$$j(E) = -1728\,\frac{(4A)^3}{\Delta} = 1728\,\frac{4A^3}{4A^3 + 27B^2}.$$

(When using the general Weierstrass equation (1), the formulas for $\Delta$ and
$j$ are more complicated, see [10] or [8].)


Theorem. Let $E$ and $E'$ be elliptic curves defined over an algebraically
closed field $K$. Then $E$ is $K$-isomorphic to $E'$ if and only if
$j(E) = j(E')$.


Two special types of elliptic curves are those with j-invariant 0 and 1728.
These curves are given by equations of the form

$$E : y^2 = x^3 + Ax, \qquad j = 1728,$$
$$E : y^2 = x^3 + B, \qquad j = 0.$$
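As a quick illustration of the formulas for $\Delta$ and $j$ on an equation of the form (2), here is a minimal Python sketch. The function names `discriminant` and `j_invariant` are illustrative choices, not notation from the text, and exact rational arithmetic is assumed for the coefficients.

```python
from fractions import Fraction

def discriminant(A, B):
    """Discriminant -16(4A^3 + 27B^2) of y^2 = x^3 + A x + B."""
    return -16 * (4 * A**3 + 27 * B**2)

def j_invariant(A, B):
    """j = 1728 * 4A^3 / (4A^3 + 27B^2); the curve must be non-singular."""
    A, B = Fraction(A), Fraction(B)
    if discriminant(A, B) == 0:
        raise ValueError("singular curve: the discriminant vanishes")
    return 1728 * 4 * A**3 / (4 * A**3 + 27 * B**2)

print(j_invariant(1, 0))   # the curve y^2 = x^3 + x
print(j_invariant(0, 1))   # the curve y^2 = x^3 + 1
```

Replacing $(A, B)$ by $(u^4A, u^6B)$ multiplies both $4A^3$ and $27B^2$ by $u^{12}$, so the value returned by `j_invariant` does not change; this is one way to see that $j$ depends only on the isomorphism class over an algebraically closed field.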
This survey summarizes, without proof, some of the basic theory of elliptic
curves. Proofs for most of the theorems can be found in the references listed
at the end, see especially [3], [8], and [9].

§2. THE GROUP LAW


The points on an elliptic curve form a group. The group law can be
characterized in a number of equivalent ways. Let $E$ be an elliptic curve
and $P, Q \in E$. The sum $P + Q$ is the (unique) point $R$ satisfying

$$(P) + (Q) \sim (R) + (O),$$

where $\sim$ denotes linear equivalence of divisors. Geometrically, three points
sum to zero if and only if they are collinear. Using this geometric
characterization, one can write down explicit formulas. For example, if
$P = (x, y)$ and $P' = (x', y')$ are on the curve given by the equation (2), then

$$x(P + P') = \left(\frac{y' - y}{x' - x}\right)^2 - x - x'
\quad\text{and}\quad
x(2P) = \frac{x^4 - 2Ax^2 - 8Bx + A^2}{4x^3 + 4Ax + 4B}.$$

Similarly, the additive inverse of $P = (x, y)$ is $-P = (x, -y)$.
Repeated addition gives multiplication maps

$$[m] : E \to E, \qquad [m]P =
\begin{cases}
P + P + \cdots + P & \text{if } m > 0, \\
O & \text{if } m = 0, \\
-(P + P + \cdots + P) & \text{if } m < 0.
\end{cases}$$

Further, for any point $Q \in E$, there is the translation-by-$Q$ map

$$\tau_Q : E \to E, \qquad \tau_Q(P) = P + Q.$$
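The chord-and-tangent formulas above can be tried out over a prime field. The following is a minimal sketch, assuming the curve is given by an equation of the form (2) over $\mathbb{F}_p$ with $p$ an odd prime, that points are stored as pairs reduced modulo $p$, and that `None` stands for $O$. The names `ec_add` and `ec_mul` are illustrative, and `pow(x, -1, p)` needs Python 3.8 or later.

```python
def ec_add(P, Q, A, p):
    """Add two points of y^2 = x^3 + A x + B over F_p; None represents O."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                                    # Q = -P
    if P == Q:
        lam = (3 * x1 * x1 + A) * pow(2 * y1, -1, p) % p   # slope of the tangent
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, p) % p          # slope of the chord
    x3 = (lam * lam - x1 - x2) % p
    y3 = (lam * (x1 - x3) - y1) % p
    return (x3, y3)

def ec_mul(m, P, A, p):
    """The map [m], computed by double-and-add; negative m uses -P = (x, -y)."""
    if m < 0:
        return ec_mul(-m, None if P is None else (P[0], -P[1] % p), A, p)
    R = None
    while m:
        if m & 1:
            R = ec_add(R, P, A, p)
        P = ec_add(P, P, A, p)
        m >>= 1
    return R

# Example: P = (3, 6) lies on y^2 = x^3 + 2x + 3 over F_97, since 27 + 6 + 3 = 36.
print(ec_mul(2, (3, 6), 2, 97))
```

The coefficient $B$ never enters the addition formulas; it only determines which pairs $(x, y)$ are points of the curve.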


Riemann-Roch tells us that an elliptic curve has a unique holomorphic
differential (up to scalar). On the Weierstrass equations (1) and (2) it is
given by

$$\omega_E = \frac{dx}{2y + a_1x + a_3}
\quad\text{and}\quad
\omega_E = \frac{dx}{2y}
\quad\text{respectively.}$$

The uniqueness of $\omega_E$ implies that it is translation invariant,

$$\tau_Q^*(\omega_E) = \omega_E \quad\text{for all } Q \in E.$$
§3. SINGULAR CUBICS 


If the discriminant of a Weierstrass equation (1) or (2) vanishes, then the 
curve is singular, with exactly one singular point. There are two possible 
behaviors. Either the singular point has two distinct tangent directions (a 
node), or it has only a single tangent direction (a cusp). The non-singular 
locus is denoted 


$$E^{ns} = \{P \in E : P \text{ is a non-singular point of } E\}.$$

The group law described above makes the non-singular locus into a group:

$$E^{ns} \cong
\begin{cases}
\text{the multiplicative group } \mathbb{G}_m & \text{if } E \text{ has a node,} \\
\text{the additive group } \mathbb{G}_a & \text{if } E \text{ has a cusp.}
\end{cases}$$

§4. ISOGENIES

A non-constant morphism $\phi : E_1 \to E_2$ between elliptic curves which
satisfies $\phi(O) = O$ is called an isogeny.

Proposition. An isogeny $\phi : E_1 \to E_2$ is always a group homomorphism.
That is, $\phi(P + Q) = \phi(P) + \phi(Q)$.


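It follows that the kernel of an isogeny $\phi : E_1 \to E_2$ is a finite
subgroup of $E_1$. The degree of $\phi$ is its degree as a finite map of curves.
(The constant map sending $E_1$ to $O$ is defined to have degree zero.)

Associated to an isogeny $\phi : E_1 \to E_2$ of degree $n$ is a dual isogeny

$$\hat\phi : E_2 \to E_1$$

characterized by the property that

$$\hat\phi \circ \phi = [n]_{E_1} \quad\text{and}\quad \phi \circ \hat\phi = [n]_{E_2}.$$

The dual isogeny has the following additional properties:

$$\hat{\hat\phi} = \phi, \qquad
\widehat{\phi + \psi} = \hat\phi + \hat\psi, \qquad
\widehat{\phi \circ \psi} = \hat\psi \circ \hat\phi, \qquad
\widehat{[m]} = [m].$$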
§5. THE ENDOMORPHISM RING 
The set of isogenies from $E$ to itself, together with the zero map, form
a ring which we denote by $\mathrm{End}(E)$ and call the endomorphism ring of
$E$. We make $\mathrm{End}(E)$ into a ring via the rules

$$(\phi + \psi)(P) = \phi(P) + \psi(P) \quad\text{and}\quad (\phi\psi)(P) = \phi(\psi(P)).$$

The unit group of $\mathrm{End}(E)$ consists of the isomorphisms from $E$ to
itself. It is called the automorphism group of $E$ and is denoted
$\mathrm{Aut}(E)$.


Theorem. Let $E$ be an elliptic curve defined over a field $K$.
(a) The endomorphism ring of $E$ is one of the following three sorts of
rings:

$$\mathrm{End}(E) \cong
\begin{cases}
\mathbb{Z}, \\
\text{an order in a quadratic imaginary field,} \\
\text{a maximal order in a quaternion algebra.}
\end{cases}$$

The third possibility can only occur if $\mathrm{char}(K) > 0$.
(b) Assume $\mathrm{char}(K) \ne 2, 3$. Then the automorphism group of $E$ is
given by

$$\mathrm{Aut}(E) \cong
\begin{cases}
\mu_2 & \text{if } j(E) \ne 0, 1728, \\
\mu_4 & \text{if } j(E) = 1728, \\
\mu_6 & \text{if } j(E) = 0.
\end{cases}$$

(Here $\mu_n$ is the group of $n$th roots of unity.)

An elliptic curve whose endomorphism ring is strictly larger than $\mathbb{Z}$
is said to have complex multiplication (or CM for short). For example, the
curves with $j = 0$ and $j = 1728$ have CM.

§6. TORSION POINTS


The kernel of the multiplication-by-$m$ map consists of the points whose
order divides $m$. This subgroup is denoted

$$E[m] = \ker[m] = \{P \in E : [m]P = O\}.$$

The torsion subgroup of $E$ is the set of all points of finite order,

$$E_{tors} = \{P \in E : [m]P = O \text{ for some } m \ge 1\} = \bigcup_{m \ge 1} E[m].$$


Remark. When we write $E$, $E[m]$, $E_{tors}$, etc., we are always referring to
geometric points, that is, to points defined over an algebraically closed
field. If $E$ is defined over $K$ and we want to discuss only the points defined
over $K$, we will write $E(K)$, $E(K)[m]$, and $E_{tors}(K)$.


Proposition. Let $E/K$ be an elliptic curve.
(a) If $\mathrm{char}(K) = 0$ or if $\mathrm{char}(K) = p$ with $p \nmid m$, then

$$E[m] \cong \mathbb{Z}/m\mathbb{Z} \times \mathbb{Z}/m\mathbb{Z}.$$

(b) If $\mathrm{char}(K) = p > 0$, then

$$E[p^n] \cong \mathbb{Z}/p^n\mathbb{Z} \quad\text{or}\quad 0.$$


For a fixed prime $\ell$, consider the inverse system of $\ell$-power torsion
points via the maps $[\ell] : E[\ell^{n+1}] \to E[\ell^n]$. The inverse limit is
called the ($\ell$-adic) Tate module of $E$ and denoted

$$T_\ell(E) = \varprojlim E[\ell^n].$$

If $\mathrm{char}(K) \ne \ell$, then $T_\ell(E)$ is a free
$\mathbb{Z}_\ell$-module of rank 2,

$$T_\ell(E) \cong \mathbb{Z}_\ell \times \mathbb{Z}_\ell.$$

It is often more convenient to work with the $\mathbb{Q}_\ell$-vector space

$$V_\ell(E) = T_\ell(E) \otimes \mathbb{Q} \cong \mathbb{Q}_\ell \times \mathbb{Q}_\ell.$$

§7. GALOIS REPRESENTATIONS ATTACHED TO E


If $E$ is defined over $K$, then its torsion points are defined over the
algebraic closure of $K$, and we can look at the associated Galois action. To
simplify our exposition, we will always assume that

        $K$ is a perfect field.

We also fix an algebraic closure $\bar K$ of $K$.
The action of Galois commutes with the group law on $E$, so if
$\mathrm{char}(K) = 0$ or if $\mathrm{char}(K) = p$ with $p \nmid m$, then we
obtain a two-dimensional representation

$$\bar\rho_m : G_{\bar K/K} \to \mathrm{Aut}(E[m]) \cong GL_2(\mathbb{Z}/m\mathbb{Z}).$$

These representations are extremely important in studying the arithmetic
properties of $E$.

Proposition. The determinant $\det(\bar\rho_m)$ of the representation
$\bar\rho_m$ is equal to the cyclotomic character

$$\chi_m : G_{\bar K/K} \to \mathrm{Aut}(\mu_m) \cong (\mathbb{Z}/m\mathbb{Z})^*.$$


The $\ell$-power representations $\bar\rho_{\ell^n}$ fit together to give the
$\ell$-adic representation of $E$,

$$\rho_\ell : G_{\bar K/K} \to \mathrm{Aut}(T_\ell(E)) \cong GL_2(\mathbb{Z}_\ell).$$

The associated vector space representation is also denoted $\rho_\ell$.


Remark. The Tate module $V_\ell(E)$ is dual to the étale cohomology group
$H^1_{et}(E, \mathbb{Q}_\ell)$, so the associated representation can equally
well be defined using cohomology.


§8. THE WEIL PAIRING 
Let $E/K$ be an elliptic curve, and fix an integer $m \ge 2$. If
$\mathrm{char}(K) > 0$, we assume that it does not divide $m$. The Weil pairing
is a pairing

$$e_m : E[m] \times E[m] \to \mu_m$$

defined as follows: Let $S, T \in E[m]$. Choose a function $g$ on $E$ whose
divisor satisfies

$$\mathrm{div}(g) = [m]^*(T) - [m]^*(O).$$

Then

$$e_m(S, T) = \frac{g(X + S)}{g(X)}$$

for any point $X \in E$ such that $g$ is defined at $X$ and at $X + S$.

Proposition. The Weil pairing is

Bilinear:            $e_m(S_1 + S_2, T) = e_m(S_1, T)\,e_m(S_2, T)$,
                     $e_m(S, T_1 + T_2) = e_m(S, T_1)\,e_m(S, T_2)$.

Alternating:         $e_m(T, T) = 1$.

Non-degenerate:      $e_m(S, T) = 1$ for all $S$ $\Longrightarrow$ $T = O$.

Galois Equivariant:  $e_m(S^\sigma, T^\sigma) = e_m(S, T)^\sigma$ for all $\sigma \in G_{\bar K/K}$.

Thus $e_m$ induces an isomorphism

$$E[m] \wedge E[m] \xrightarrow{\ \sim\ } \mu_m$$

of Galois modules. Let $\bar\rho_m : G_{\bar K/K} \to \mathrm{Aut}(E[m])$ be the
Galois representation attached to $E$, and let
$\chi_m : G_{\bar K/K} \to \mathrm{Aut}(\mu_m)$ be the cyclotomic character.
Then with this identification, we have for any $\sigma \in G_{\bar K/K}$:

$$\chi_m(\sigma)(S \wedge T) = (S \wedge T)^\sigma = S^\sigma \wedge T^\sigma
= \bar\rho_m(\sigma)S \wedge \bar\rho_m(\sigma)T,$$

which verifies the formula $\det(\bar\rho_m) = \chi_m$ as stated in §7.
Let $\phi : E_1 \to E_2$ be an isogeny. Then the dual isogeny
$\hat\phi : E_2 \to E_1$ is dual (i.e., adjoint) with respect to the Weil
pairing:

$$e_m(S, \hat\phi(T)) = e_m(\phi(S), T) \quad\text{for all } S \in E_1[m] \text{ and } T \in E_2[m].$$

The $\ell$-power Weil pairings $e_{\ell^n}$ fit together to define a bilinear,
alternating, non-degenerate, Galois equivariant pairing

$$e_\ell : T_\ell(E) \times T_\ell(E) \to T_\ell(\mu),$$

where $T_\ell(\mu) = \varprojlim \mu_{\ell^n}$ is the Tate module of the
multiplicative group $\mathbb{G}_m$.


§9. ELLIPTIC CURVES OVER FINITE FIELDS 


Let $E/\mathbb{F}_q$ be an elliptic curve defined over a field with $q$
elements. Then the group of rational points $E(\mathbb{F}_q)$ is a finite group.


Theorem. (Hasse)

$$|q + 1 - \#E(\mathbb{F}_q)| \le 2\sqrt{q}.$$

Proof sketch. Let $\phi : E \to E$ be the Frobenius morphism given on
Weierstrass coordinates by $\phi(x, y) = (x^q, y^q)$. Then
$E(\mathbb{F}_q) = \ker(1 - \phi)$. Further, one can show that the map
$1 - \phi$ is separable by looking at its action on the invariant differential,
so

$$\#E(\mathbb{F}_q) = \#\ker(1 - \phi) = \deg(1 - \phi).$$

We know that

$$\hat\phi \circ \phi = \deg\phi = q \in \mathbb{Z} \subset \mathrm{End}(E),$$

and we let

$$a = \phi + \hat\phi \in \mathbb{Z} \subset \mathrm{End}(E).$$

Then for any $m, n \in \mathbb{Z}$ we have

$$0 \le \deg(m + n\phi) = (m + n\hat\phi) \circ (m + n\phi) = m^2 + amn + qn^2.$$

The non-negativity implies that the quadratic form is positive semi-definite,
so its discriminant is non-positive, $a^2 - 4q \le 0$. In particular, putting
$m = 1$ and $n = -1$ yields

$$\#E(\mathbb{F}_q) = \deg(1 - \phi) = 1 - a + q,$$

which combined with $|a| \le 2\sqrt{q}$ gives the desired result.

Remark. Examining the above proof, we see that we have proven the following
fundamental formula for the sum of the $q$-power Frobenius map and its dual:

$$\phi + \hat\phi = [\,\text{multiplication by } q + 1 - \#E(\mathbb{F}_q) \text{ on } E\,].$$

Hasse's theorem says that the trace of Frobenius, that is $\phi + \hat\phi$, is
an integer in $\mathrm{End}(E)$ of magnitude at most $2\sqrt{q}$.
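Hasse's bound is easy to observe numerically for small primes. The sketch below assumes a curve of the form (2) over a prime field $\mathbb{F}_p$ with $p$ odd and counts points by brute force; `count_points` and `trace_of_frobenius` are illustrative names, and this naive count is only practical for small $p$.

```python
def count_points(A, B, p):
    """#E(F_p) for y^2 = x^3 + A x + B over F_p (p an odd prime), by brute force."""
    # sqrt_count[v] = number of y in F_p with y^2 = v
    sqrt_count = [0] * p
    for y in range(p):
        sqrt_count[y * y % p] += 1
    affine = sum(sqrt_count[(x**3 + A * x + B) % p] for x in range(p))
    return affine + 1                     # add the point O at infinity

def trace_of_frobenius(A, B, p):
    """a = p + 1 - #E(F_p)."""
    return p + 1 - count_points(A, B, p)

for p in (5, 7, 11, 13):
    a = trace_of_frobenius(0, 1, p)       # the curve y^2 = x^3 + 1
    print(p, count_points(0, 1, p), a, a * a <= 4 * p)
```

The last printed column checks $a^2 \le 4q$, which is the inequality $|a| \le 2\sqrt{q}$ without square roots.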

The zeta function of an elliptic curve $E/\mathbb{F}_q$ is defined by the formal
power series

$$Z(E/\mathbb{F}_q, T) = \exp\left(\sum_{n \ge 1} \#E(\mathbb{F}_{q^n})\,\frac{T^n}{n}\right).$$


Theorem. Let $E/\mathbb{F}_q$ be an elliptic curve. The zeta function of $E$ is
a rational function of the form

$$Z(E/\mathbb{F}_q, T) = \frac{1 - aT + qT^2}{(1 - T)(1 - qT)},$$

where $a$ is the trace of Frobenius,

$$a = q + 1 - \#E(\mathbb{F}_q) = \phi + \hat\phi.$$

Further,

$$1 - aT + qT^2 = (1 - \alpha T)(1 - \beta T) \quad\text{with } |\alpha| = |\beta| = \sqrt{q}.$$
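Taking logarithms of the rational expression for the zeta function gives $\#E(\mathbb{F}_{q^n}) = q^n + 1 - (\alpha^n + \beta^n)$, and the power sums $s_n = \alpha^n + \beta^n$ satisfy $s_n = a\,s_{n-1} - q\,s_{n-2}$ with $s_0 = 2$, $s_1 = a$. The sketch below uses this recurrence, so a single point count over $\mathbb{F}_q$ determines the counts over all extensions. The function name is an illustrative choice.

```python
def points_over_extensions(a, q, n_max):
    """[#E(F_q), #E(F_{q^2}), ..., #E(F_{q^n_max})] from a = q + 1 - #E(F_q)."""
    counts = []
    s_prev, s_cur = 2, a                  # s_0 = 2 and s_1 = alpha + beta = a
    for n in range(1, n_max + 1):
        counts.append(q**n + 1 - s_cur)
        s_prev, s_cur = s_cur, a * s_cur - q * s_prev
    return counts

# y^2 = x^3 + 1 over F_5 has 6 points, so a = 0.
print(points_over_extensions(0, 5, 3))
```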
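Isogenous elliptic curves have the same number of points, since if
$\psi : E \to E'$ is an isogeny defined over $\mathbb{F}_q$, then

$$
\begin{aligned}
\deg(\psi)\,\#E(\mathbb{F}_q) &= \deg(\psi)\deg(1 - \phi_E) \\
&= \deg(\psi - \psi \circ \phi_E) \\
&= \deg(\psi - \phi_{E'} \circ \psi) \\
&= \deg(1 - \phi_{E'})\deg(\psi) \\
&= \#E'(\mathbb{F}_q)\,\deg(\psi).
\end{aligned}
$$

The converse is also true, but harder to prove:


Theorem. Two elliptic curves $E/\mathbb{F}_q$ and $E'/\mathbb{F}_q$ are
isogenous over $\mathbb{F}_q$ if and only if
$Z(E/\mathbb{F}_q, T) = Z(E'/\mathbb{F}_q, T)$.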


For an elliptic curve over a finite field, the $p$-torsion, the Frobenius map,
and the endomorphism ring are closely related.

Theorem. Let $E/\mathbb{F}_q$ be an elliptic curve, let
$p = \mathrm{char}(\mathbb{F}_q)$ and let $\phi : E \to E$ be the $q$th-power
Frobenius map. The following are equivalent:
(i) $E[p] = 0$.
(ii) The dual $\hat\phi$ of Frobenius is purely inseparable.
(iii) The map $[p] : E \to E$ is purely inseparable.
(iv) $\mathrm{End}(E)$ is an order in a quaternion algebra.
If these conditions hold, we say that $E$ is supersingular, otherwise we say
that $E$ is ordinary. If $E$ is ordinary, then
$E[p] \cong \mathbb{Z}/p\mathbb{Z}$ and $\mathrm{End}(E)$ is an order in a
quadratic imaginary field.


The supersingular elliptic curves in characteristic $p$ all have j-invariants
lying in $\mathbb{F}_{p^2}$. Up to $\bar{\mathbb{F}}_p$-isomorphism, there are
approximately $p/12$ of them.
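A small computational check of the ordinary/supersingular dichotomy is sketched below. It relies on a standard fact that is not stated in the text above and is assumed here: a curve over $\mathbb{F}_q$ is supersingular exactly when $p$ divides the trace of Frobenius $a = q + 1 - \#E(\mathbb{F}_q)$. The curve is taken in the form (2) over a prime field with $p$ odd, and `is_supersingular` is an illustrative name.

```python
def is_supersingular(A, B, p):
    """Test y^2 = x^3 + A x + B over F_p (p an odd prime) for supersingularity.

    Assumption (standard, not from the text): E is supersingular if and only
    if p divides a = p + 1 - #E(F_p).
    """
    sqrt_count = [0] * p
    for y in range(p):
        sqrt_count[y * y % p] += 1
    n_points = 1 + sum(sqrt_count[(x**3 + A * x + B) % p] for x in range(p))
    return (p + 1 - n_points) % p == 0

print(is_supersingular(0, 1, 5))   # y^2 = x^3 + 1 over F_5
print(is_supersingular(0, 1, 7))   # the same equation over F_7
```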
§10. ELLIPTIC CURVES OVER C AND ELLIPTIC FUNCTIONS


The complex analytic theory of elliptic curves is vast, so we will only
hit on a few highlights. Let $L \subset \mathbb{C}$ be a lattice. An elliptic
function is an $L$-periodic meromorphic function $f(z)$, that is,
$f(z + \omega) = f(z)$ for all $z \in \mathbb{C}$ and all $\omega \in L$. The
collection of all elliptic functions for $L$ forms a field, denoted
$\mathbb{C}(L)$.


The Weierstrass $\wp$-function

$$\wp(z) = \wp(z, L) = \frac{1}{z^2} + \sum_{\omega \in L,\ \omega \ne 0}
\left(\frac{1}{(z - \omega)^2} - \frac{1}{\omega^2}\right)$$

is an elliptic function with a double pole at each point of $L$ and no other
poles. Also associated to the lattice $L$ are the Eisenstein series

$$G_{2k}(L) = \sum_{\omega \in L,\ \omega \ne 0} \frac{1}{\omega^{2k}}.$$

These series are absolutely convergent for all integers $k \ge 2$. Notice
that $G_{2k}$ has the property that
$G_{2k}(\lambda L) = \lambda^{-2k}G_{2k}(L)$ for any
$\lambda \in \mathbb{C}^*$. It is standard to set

$$g_2(L) = 60\,G_4(L) \quad\text{and}\quad g_3(L) = 140\,G_6(L).$$


Theorem. (a) $\mathbb{C}(L) = \mathbb{C}(\wp(z), \wp'(z))$.
(b) The Weierstrass $\wp$-function and its derivative satisfy the identity

$$\wp'(z)^2 = 4\wp(z)^3 - g_2(L)\wp(z) - g_3(L).$$

Further, the discriminant

$$\Delta(L) = g_2(L)^3 - 27g_3(L)^2$$

of the cubic polynomial is non-zero, so the equation

$$E_L : y^2 = 4x^3 - g_2(L)x - g_3(L)$$

defines an elliptic curve over $\mathbb{C}$.
(c) The map

$$\phi_L : \mathbb{C}/L \to E_L(\mathbb{C}), \qquad z \mapsto (\wp(z), \wp'(z)),$$

is a complex analytic isomorphism of complex Lie groups.
(d) Conversely, given any elliptic curve $E/\mathbb{C}$, there exists a lattice
$L$, unique up to homothety, such that $E_L \cong E$.


Corollary. Let $E/\mathbb{C}$ be an elliptic curve and let $m \ge 1$ be an
integer. Then as an abstract group,
$E[m] \cong \mathbb{Z}/m\mathbb{Z} \times \mathbb{Z}/m\mathbb{Z}$.
Proof. $E[m] \cong \ker(\mathbb{C}/L \xrightarrow{\ z \mapsto mz\ } \mathbb{C}/L) = (1/m)L/L \cong (\mathbb{Z}/m\mathbb{Z})^2$.
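Another useful function is the Weierstrass $\sigma$-function

$$\sigma(z) = \sigma(z, L) = z \prod_{\omega \in L,\ \omega \ne 0}
\left(1 - \frac{z}{\omega}\right) e^{z/\omega + (1/2)(z/\omega)^2}.$$

It is a theta function and can be used to construct elliptic functions. For
example,

$$\wp(z) - \wp(a) = -\frac{\sigma(z + a)\,\sigma(z - a)}{\sigma(z)^2\sigma(a)^2}
\quad\text{and}\quad
\wp'(z) = -\frac{\sigma(2z)}{\sigma(z)^4}.$$

If $E_1$ and $E_2$ are associated to the lattices $L_1$ and $L_2$ respectively,
then one can show that

$$\mathrm{Hom}(E_1, E_2) \cong \{\alpha \in \mathbb{C} : \alpha L_1 \subset L_2\},$$

where the isogeny associated to $\alpha$ is given analytically by

$$\mathbb{C}/L_1 \to \mathbb{C}/L_2, \qquad z \mapsto \alpha z.$$

Using this, it is not hard to show that if
$L = \omega_1\mathbb{Z} + \omega_2\mathbb{Z}$, then either
(i) $\mathrm{End}(E_L) = \mathbb{Z}$, or

(ii) $\mathbb{Q}(\omega_1/\omega_2)$ is a quadratic imaginary field, and
$\mathrm{End}(E_L)$ is isomorphic to an order in
$\mathbb{Q}(\omega_1/\omega_2)$.


Homothetic lattices correspond to isomorphic elliptic curves, so it is
common practice to use the normalized lattices

$$L_\tau = \tau\mathbb{Z} + \mathbb{Z} \quad\text{with } \mathrm{Im}(\tau) > 0.$$

One then writes $\wp(z, \tau)$, $\sigma(z, \tau)$, $G_{2k}(\tau)$, etc. An
elliptic function for $L_\tau$ is $\mathbb{Z}$-periodic, and thus may be
written as a function of

$$u = e^{2\pi i z} \quad\text{and}\quad q = e^{2\pi i \tau}.$$
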

This is equivalent to using the natural isomorphism

$$\mathbb{C}/L_\tau \to \mathbb{C}^*/q^{\mathbb{Z}}, \qquad z \mapsto e^{2\pi i z}.$$

Theorem.

$$\frac{1}{(2\pi i)^2}\,\wp(z, \tau) = \sum_{n \in \mathbb{Z}} \frac{q^n u}{(1 - q^n u)^2}
+ \frac{1}{12} - 2\sum_{n \ge 1} \frac{q^n}{(1 - q^n)^2},$$

$$\sigma(z, \tau) = -\frac{1}{2\pi i}\, e^{\eta z^2/2}\, e^{-\pi i z}\,(1 - u)
\prod_{n \ge 1} \frac{(1 - q^n u)(1 - q^n u^{-1})}{(1 - q^n)^2}.$$

(Here $\eta = \eta(\tau)$ is a complex number called a quasi-period of the
lattice $L_\tau$.)


Finally, I want to mention the $q$-expansions for $\Delta$ and $j$ and the
Eisenstein series $G_{2k}$, and also to state Jacobi's beautiful product
formula for the discriminant function.


Theorem. As functions of $q = e^{2\pi i \tau}$, the Eisenstein series
$G_{2k}$, the discriminant function $\Delta(\tau)$ and the j-invariant
$j(\tau)$ have the following expansions in $\mathbb{Z}[[q]]$:

$$G_{2k}(\tau) = 2\zeta(2k) + 2\,\frac{(2\pi i)^{2k}}{(2k - 1)!}
\sum_{n \ge 1} \Big(\sum_{d \mid n} d^{2k-1}\Big) q^n,$$

$$\Delta(\tau) = (2\pi)^{12} \sum_{n \ge 1} \tau(n) q^n
= (2\pi)^{12}\,(q - 24q^2 + 252q^3 - 1472q^4 + \cdots),$$

$$j(\tau) = \frac{1}{q} + \sum_{n \ge 0} c(n) q^n
= q^{-1} + 744 + 196884q + 21493760q^2 + \cdots.$$

(Here $\zeta(s)$ is the Riemann zeta function.)
The discriminant function also has the following product expansion:

$$\Delta(\tau) = \Delta(L_\tau) = (2\pi)^{12}\, q \prod_{n \ge 1} (1 - q^n)^{24}.
\qquad\text{(Jacobi's formula)}$$

The integer coefficients $\tau(n)$ and $c(n)$ of $\Delta$ and $j$ have many
wonderful arithmetic properties.
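The first coefficients $\tau(n)$ and $c(n)$ can be recovered directly from Jacobi's product. The sketch below expands $\prod (1 - q^n)^{24}$ with truncated integer power series. For $j$ it assumes the standard identity $j = E_4^3 / (q \prod (1 - q^n)^{24})$ with $E_4 = 1 + 240 \sum \sigma_3(n) q^n$, which is the normalized form of $G_4$ and is not spelled out in the text. All function names are illustrative.

```python
def mul_series(f, g, N):
    """Product of two power series (coefficient lists), truncated to N terms."""
    h = [0] * N
    for i, fi in enumerate(f[:N]):
        if fi:
            for j, gj in enumerate(g[:N - i]):
                h[i + j] += fi * gj
    return h

def jacobi_product(N):
    """Coefficients of prod_{n>=1} (1 - q^n)^24 up to q^(N-1)."""
    series = [1] + [0] * (N - 1)
    for n in range(1, N):
        factor = [0] * N
        factor[0], factor[n] = 1, -1          # the factor 1 - q^n
        for _ in range(24):
            series = mul_series(series, factor, N)
    return series

def tau(N):
    """[tau(1), ..., tau(N)]: tau(n) is the q^(n-1) coefficient of the product."""
    return jacobi_product(N)

def invert_series(f, N):
    """Inverse of a power series with constant term 1, truncated to N terms."""
    g = [1] + [0] * (N - 1)
    for k in range(1, N):
        g[k] = -sum(f[i] * g[k - i] for i in range(1, k + 1))
    return g

def j_coefficients(N):
    """Coefficients of q^-1, q^0, ..., q^(N-2) in j (assumed formula E4^3 / Delta)."""
    sigma3 = lambda n: sum(d**3 for d in range(1, n + 1) if n % d == 0)
    e4 = [1] + [240 * sigma3(n) for n in range(1, N)]
    e4_cubed = mul_series(mul_series(e4, e4, N), e4, N)
    return mul_series(e4_cubed, invert_series(jacobi_product(N), N), N)

print(tau(4))
print(j_coefficients(4))
```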


§11. THE FORMAL GROUP OF AN ELLIPTIC CURVE 


Substituting $x = z/w$ and $y = -1/w$ into a Weierstrass equation for $E$
gives

$$w = z^3 + a_1zw + a_2z^2w + a_3w^2 + a_4zw^2 + a_6w^3,$$

and then repeated substitution (or Hensel's lemma) can be used to express
$w$ as a formal power series $w(z) \in \mathbb{Z}[a_1, \ldots, a_6][[z]]$. This
in turn can be used to express $x$, $y$, and the invariant differential
$\omega_E$ as formal series in $z$, and then the group law is given by a power
series $F_E(z_1, z_2)$ in two variables. The first few terms of these series
are:

$$
\begin{aligned}
w(z) &= z^3 + a_1z^4 + (a_1^2 + a_2)z^5 + (a_1^3 + 2a_1a_2 + a_3)z^6 + \cdots, \\
x(z) &= z^{-2} - a_1z^{-1} - a_2 - a_3z - (a_4 + a_1a_3)z^2 - \cdots, \\
y(z) &= -z^{-3} + a_1z^{-2} + a_2z^{-1} + a_3 + (a_4 + a_1a_3)z + \cdots, \\
\omega_E(z) &= \big(1 + a_1z + (a_1^2 + a_2)z^2 + (a_1^3 + 2a_1a_2 + 2a_3)z^3 + \cdots\big)\,dz, \\
F_E(z_1, z_2) &= z_1 + z_2 - a_1z_1z_2 - a_2(z_1^2z_2 + z_1z_2^2) + \cdots.
\end{aligned}
$$

The formal group $\hat E$ associated to $E$ is the formal group defined by the
formal group law
$F_E(z_1, z_2) \in \mathbb{Z}[a_1, \ldots, a_6][[z_1, z_2]]$.

Let $R$ be a complete local ring with maximal ideal $\mathfrak{p}$, and suppose
that the $a_i$'s are in $R$. Then $F_E$ converges for
$z_1, z_2 \in \mathfrak{p}$ and gives $\mathfrak{p}$ a group structure which we
denote by $\hat E(\mathfrak{p})$. The series $F_E$ also induces a group
structure on the powers $\mathfrak{p}^r$, which gives $\hat E(\mathfrak{p})$ a
natural filtration $\hat E(\mathfrak{p}^r)$.
The following is a general property of formal groups.


Proposition. The group $\hat E(\mathfrak{p})$ has no prime-to-$\mathfrak{p}$
torsion. In other words, if $m \not\equiv 0 \pmod{\mathfrak{p}}$, then
$\hat E(\mathfrak{p})$ has no non-trivial points of order $m$.
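The "repeated substitution" that produces $w(z)$ can be carried out mechanically. The sketch below is a minimal illustration with integer values chosen for $a_1, \ldots, a_6$ and power series truncated at $z^N$; each pass of the loop substitutes the current approximation into the right-hand side of the equation for $w$ and fixes at least one more coefficient. The name `w_series` is illustrative.

```python
def w_series(a1, a2, a3, a4, a6, N):
    """Coefficients [w_0, ..., w_(N-1)] of w(z), found by iterating
    w <- z^3 + a1*z*w + a2*z^2*w + a3*w^2 + a4*z*w^2 + a6*w^3."""
    def mul(f, g):
        h = [0] * N
        for i, fi in enumerate(f):
            if fi:
                for j, gj in enumerate(g[:N - i]):
                    h[i + j] += fi * gj
        return h

    def shift(f, k):                      # multiply a series by z^k
        return ([0] * k + f)[:N]

    z3 = [1 if i == 3 else 0 for i in range(N)]
    w = [0] * N
    for _ in range(N):                    # N passes are enough for N coefficients
        w2 = mul(w, w)
        w3 = mul(w2, w)
        zw, z2w, zw2 = shift(w, 1), shift(w, 2), shift(w2, 1)
        w = [z3[i] + a1 * zw[i] + a2 * z2w[i] + a3 * w2[i]
             + a4 * zw2[i] + a6 * w3[i] for i in range(N)]
    return w

# With a1 = 2, a2 = 3, a3 = 5 the displayed terms predict
# w = z^3 + 2 z^4 + (4 + 3) z^5 + (8 + 12 + 5) z^6 + ...
print(w_series(2, 3, 5, 7, 11, 7))
```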


§12. ELLIPTIC CURVES OVER LOCAL FIELDS


For this section we set the following notation:
    $K$             a complete local field with normalized valuation $v : K^* \to \mathbb{Z}$.
    $R$             the ring of integers of $K$.
    $\mathfrak{p}$  the maximal ideal of $R$.
    $k$             the residue field $k = R/\mathfrak{p}$.
A minimal Weierstrass equation for an elliptic curve $E/K$ is a Weierstrass
equation

$$E : y^2 + a_1xy + a_3y = x^3 + a_2x^2 + a_4x + a_6$$

with $a_i \in R$ and $v(\Delta)$ minimized. If $\mathrm{char}(k) \ne 2, 3$,
then $E$ always has a minimal equation with $a_1 = a_2 = a_3 = 0$.

The reduction of $E$ modulo $\mathfrak{p}$, denoted $\tilde E$, is then the
curve over $k$ defined by the equation

$$\tilde E : y^2 + \tilde a_1xy + \tilde a_3y = x^3 + \tilde a_2x^2 + \tilde a_4x + \tilde a_6,$$

where the tilde denotes reduction modulo $\mathfrak{p}$. The curve $\tilde E$
may be singular; its non-singular part is denoted $\tilde E^{ns}$. We say that

$E$ has good (or stable) reduction if $\tilde E$ is non-singular.

$E$ has multiplicative (or semi-stable) reduction if $\tilde E$ has a node. The
reduction is called split if the tangent directions are defined over $k$,
otherwise it is non-split.

$E$ has additive (or unstable) reduction if $\tilde E$ has a cusp.


Remark. It is becoming common to use the term “semi-stable” to refer to 
an elliptic curve which has either good or multiplicative reduction, while 
“unstable” retains its meaning of additive reduction. 
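For a curve over $\mathbb{Q}$ given by $y^2 = x^3 + Ax + B$ with integers $A, B$, the reduction type at a prime $p \ge 5$ can be read off from $A$, $B$ and $\Delta$. The sketch below rests on standard facts that are assumptions here rather than statements from the text: such an equation is minimal at $p \ge 5$ once $p^4 \mid A$ and $p^6 \mid B$ do not both hold; a singular reduction is a cusp exactly when $p \mid A$; and at a node with double root $x_0$ the tangent slopes are $\pm\sqrt{3x_0}$, where $3x_0$ is a square modulo $p$ exactly when $-2AB$ is. The primes 2 and 3 are deliberately excluded, and the function name is illustrative.

```python
def reduction_type(A, B, p):
    """Reduction type at a prime p >= 5 of y^2 = x^3 + A x + B (A, B integers)."""
    if p < 5:
        raise ValueError("this sketch only handles primes p >= 5")
    if 4 * A**3 + 27 * B**2 == 0:
        raise ValueError("the curve is singular over Q")
    # Make the equation minimal at p: (A, B) -> (A / p^4, B / p^6) while possible.
    while A % p**4 == 0 and B % p**6 == 0:
        A, B = A // p**4, B // p**6
    disc = -16 * (4 * A**3 + 27 * B**2)
    if disc % p != 0:
        return "good"
    if A % p == 0:
        return "additive"                 # x^3 + Ax + B has a triple root mod p
    # Node: decide whether the tangent directions are defined over F_p
    # by Euler's criterion applied to -2AB.
    is_square = pow(-2 * A * B % p, (p - 1) // 2, p) == 1
    return "split multiplicative" if is_square else "non-split multiplicative"

print(reduction_type(-3, 15, 13))   # 4A^3 + 27B^2 = 5967 = 13 * 459
print(reduction_type(-3, 9, 7))     # 4A^3 + 27B^2 = 2079 = 7 * 297
print(reduction_type(0, 5, 5))
```

In the terminology of the remark above, a curve is semi-stable at $p$ when this function returns anything other than "additive".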


Proposition. Let $E/K$ be an elliptic curve. Then there is a finite
extension $K'/K$ such that $E$ has either good or split multiplicative
reduction over $K'$.


We define a filtration on $E(K)$ by

$$
\begin{aligned}
E_0(K) &= \{P \in E(K) : \tilde P \in \tilde E^{ns}(k)\}, \\
E_1(K) &= \{P \in E(K) : \tilde P = \tilde O\}, \\
E_r(K) &= \{P \in E(K) : v(x(P)) \le -2r\} \quad (\text{for } r \ge 1).
\end{aligned}
$$

Proposition. (a) There is an exact sequence

$$0 \to E_1(K) \to E_0(K) \to \tilde E^{ns}(k) \to 0.$$

(b) There is an isomorphism $E_1(K) \cong \hat E(\mathfrak{p})$. This
isomorphism identifies $E_r(K)$ with $\hat E(\mathfrak{p}^r)$.

(c) The quotient group $E(K)/E_0(K)$ is finite. More precisely, it has
order 1, 2, 3, or 4 unless $E$ has split multiplicative reduction, in which
case it is a cyclic group of order $v(\Delta)$.


Remark. Another description of the group $E(K)/E_0(K)$ is that it is
isomorphic to the group of components of the Néron model of $E$ over $R$.
The following corollary is of fundamental importance.


Corollary. If $E$ has good reduction at $\mathfrak{p}$ and $m$ is relatively
prime to $\mathrm{char}(k)$, then the reduction map

$$E(K)[m] \to \tilde E(k)$$

is injective. Equivalently, the extension $K(E[m])$ generated by the
$m$-torsion points is an unramified extension of $K$.


The following converse is often useful. Let $I_{\bar K/K}$ denote the inertia
subgroup of $G_{\bar K/K}$, and recall that a $G_{\bar K/K}$-module $M$ is said
to be unramified if $I_{\bar K/K}$ acts trivially on $M$.


Theorem. (Criterion of Néron-Ogg-Shafarevich) The following are equivalent:

(i) $E$ has good reduction.

(ii) $E[m]$ is unramified for infinitely many $m$ prime to $\mathrm{char}(k)$.

(iii) $T_\ell(E)$ is unramified for some $\ell \ne \mathrm{char}(k)$.

Corollary. If $E_1/K$ and $E_2/K$ are isogenous over $K$, then they either
both have good reduction, or neither has good reduction.


An elliptic curve $E/K$ is said to have potential good reduction if it
acquires good reduction over a finite extension of $K$.


Proposition. An elliptic curve $E/K$ has potential good reduction if and
only if $j(E) \in R$.


§13. THE SELMER AND SHAFAREVICH-TATE GROUPS 


For this section we fix the following notation: 
K a number field. 
R the ring of integers of K. 
For any place $v$ of $K$, we write $K_v$ for the completion of $K$ with respect
to $v$. If $v$ is non-archimedean, we write $R_v$, $\mathfrak{p}_v$, and $k_v$
for the ring of integers of $K_v$, maximal ideal of $R_v$, and residue field of
$R_v$ respectively.


Mordell-Weil Theorem. Let $E/K$ be an elliptic curve. Then the group
of rational points $E(K)$ is a finitely generated abelian group.

In this section we will consider a weak form of the Mordell-Weil theorem
which asserts that the quotient group $E(K)/mE(K)$ is finite. This
assertion is one of the main ingredients in the proof of the full theorem.

Fix an integer $m \ge 2$ and consider the exact sequence

$$0 \to E[m] \to E(\bar K) \xrightarrow{\ m\ } E(\bar K) \to 0.$$

Taking Galois cohomology gives the long exact sequence

$$
\begin{aligned}
0 \to E(K)[m] \to E(K) \xrightarrow{\ m\ } E(K) &\to H^1(G_{\bar K/K}, E[m]) \\
&\to H^1(G_{\bar K/K}, E(\bar K)) \xrightarrow{\ m\ } H^1(G_{\bar K/K}, E(\bar K)),
\end{aligned}
$$

and this in turn gives the Kummer sequence for $E/K$,

$$0 \to E(K)/mE(K) \to H^1(G_{\bar K/K}, E[m]) \to H^1(G_{\bar K/K}, E(\bar K))[m] \to 0.$$

Unfortunately, the group $H^1(G_{\bar K/K}, E[m])$ need not be finite. However,
any element of $H^1(G_{\bar K/K}, E[m])$ which comes from a point of $E(K)$ will
necessarily come from a point in $E(K_v)$ for every completion of $K$. In other
words, if we consider the Kummer sequence for $E/K_v$ and restriction maps
on cohomology, we get a commutative diagram
$$
\begin{array}{ccccccccc}
0 & \to & E(K)/mE(K) & \to & H^1(G_{\bar K/K}, E[m]) & \to & H^1(G_{\bar K/K}, E(\bar K))[m] & \to & 0 \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
0 & \to & \prod_v E(K_v)/mE(K_v) & \to & \prod_v H^1(G_{\bar K_v/K_v}, E[m]) & \to & \prod_v H^1(G_{\bar K_v/K_v}, E(\bar K_v))[m] & \to & 0
\end{array}
$$

This suggests the following definitions: The $m$-Selmer group of $E/K$ is
the group

$$S^{(m)}(E/K) = \ker\Big\{ H^1(G_{\bar K/K}, E[m]) \to \prod_v H^1(G_{\bar K_v/K_v}, E(\bar K_v)) \Big\}.$$

The Shafarevich-Tate group of $E/K$ is the group

$$Ш(E/K) = \ker\Big\{ H^1(G_{\bar K/K}, E(\bar K)) \to \prod_v H^1(G_{\bar K_v/K_v}, E(\bar K_v)) \Big\}.$$

It is immediate from these definitions that there is an exact sequence

$$0 \to E(K)/mE(K) \to S^{(m)}(E/K) \to Ш(E/K)[m] \to 0.$$


Theorem. The Selmer group $S^{(m)}(E/K)$ is finite. Hence $E(K)/mE(K)$
and $Ш(E/K)[m]$ are also finite.


Proof sketch. Let $\mathfrak{p}$ be a prime of $K$ not dividing $m$ for which
$E$ has good reduction. Then
$E[m] \hookrightarrow \tilde E(\bar k_{\mathfrak p})$ (i.e., the $m$-torsion
injects into the reduction modulo $\mathfrak{p}$). This implies that any cocycle
in $S^{(m)}(E/K)$ is unramified at $\mathfrak{p}$, so $S^{(m)}(E/K)$ consists of
cocycles which are unramified outside a finite set of primes, specifically
outside the set

$$\{\mathfrak{p} : E \text{ has bad reduction at } \mathfrak{p}\} \cup \{\mathfrak{p} : \mathfrak{p} \text{ divides } m\}.$$

Finally, it is an elementary consequence of Dirichlet's unit theorem and
the finiteness of the class group that for any finite $G_{\bar K/K}$-module $M$
and any finite set of places $S$, the set of cocycles in
$H^1(G_{\bar K/K}, M)$ unramified outside $S$ is finite.


Remark. More generally, if $\phi : E \to E'$ is an isogeny of elliptic curves
defined over $K$, there is an associated Kummer sequence

$$0 \to E'(K)/\phi(E(K)) \to H^1(G_{\bar K/K}, E[\phi]) \to H^1(G_{\bar K/K}, E(\bar K))[\phi] \to 0.$$

Using this, one defines in an analogous fashion the $\phi$-Selmer group,
denoted $S^{(\phi)}(E/K)$, which can be shown to be finite, and an associated
exact sequence

$$0 \to E'(K)/\phi(E(K)) \to S^{(\phi)}(E/K) \to Ш(E/K)[\phi] \to 0.$$

The group $H^1(G_{\bar K/K}, E(\bar K))$ can also be interpreted as the
collection of homogeneous spaces of $E/K$. Generally, one defines the
Weil-Châtelet group of $E/K$ to be†

$$WC(E/K) = \left\{ \begin{array}{l}
K\text{-isomorphism classes of smooth projective curves} \\
C/K \text{ such that } C \text{ is isomorphic to } E \text{ over } \bar K
\end{array} \right\}.$$

†This is cheating a little bit. The Weil-Châtelet group is actually the group of
principal homogeneous spaces for $E/K$. That is, an element of $WC(E/K)$
consists of a curve $C/K$ and a simply transitive algebraic group action of $E$
on $C$ defined over $K$. Further, in defining the associated cocycle, we need to
choose an isomorphism $f : C \to E$ with the property that
$f^\sigma \circ f^{-1}$ is a pure translation map on $E$.

A homogeneous space $C/K$ represents the zero element of $WC(E/K)$ if
and only if $C(K)$ is non-empty.

There is a natural isomorphism
$WC(E/K) \cong H^1(G_{\bar K/K}, E(\bar K))$ defined in the following way. Let
$[C/K] \in WC(E/K)$ and choose an isomorphism $f : C \to E$ defined over
$\bar K$ and a point $P \in C(\bar K)$. Then the cocycle

$$G_{\bar K/K} \to E(\bar K), \qquad \sigma \mapsto f(P^\sigma) - f(P),$$

represents the cohomology class in $H^1(G_{\bar K/K}, E(\bar K))$ associated to
$C/K$. With this identification, the subgroup $Ш(E/K)$ in $WC(E/K)$ consists
of all homogeneous spaces $C/K$ such that $C(K_v)$ is non-empty for all places
$v$ of $K$.


Remark. Each Selmer group $S^{(m)}(E/K)$ is effectively computable in theory,
and frequently computable in practice. At present, there is no proven
effective method for determining which part of $S^{(m)}(E/K)$ comes from
$E(K)/mE(K)$ and which part comes from $Ш(E/K)$.


§14. DISCRIMINANTS, CONDUCTORS, AND L-SERIES 


Let K be a number field and E/K an elliptic curve. For each prime p 
of K we can consider a minimal Weierstrass equation for the local field K, 
and the discriminant A, of this minimal equation. The minimal discrimi- 
nant of E'/K is the integral ideal 


DeysK _ I] e?¢. 
p 


If K has class number one (e.g., K = Q), it is possible to find a Weierstrass 
equation 
E:y? +a,cy +agy = 2° + agz” + ast +46 

which is simultaneously minimal at all primes of K. The discriminant A of 
this global minimal Weierstrass equation is then equal to the discriminant 
of E/K (and is uniquely determined up to multiplication by the 12'*-power 
of a unit.) 

The minimal discriminant is a measure of the bad reduction of E. Another
such measure is the conductor of E/K. This is an ideal

$$N_{E/K} = \prod_{\mathfrak p} \mathfrak p^{f_{\mathfrak p}(E/K)},$$

where the exponents $f_{\mathfrak p}(E/K)$ are given by

$$f_{\mathfrak p}(E/K) = \begin{cases}
0 & \text{if $E$ has good reduction at $\mathfrak p$,}\\
1 & \text{if $E$ has multiplicative reduction at $\mathfrak p$,}\\
2 & \text{if $E$ has additive reduction at $\mathfrak p$ and $\mathfrak p \nmid 6$.}
\end{cases}$$

If $\mathfrak p$ has residue characteristic 2 or 3 and E has additive reduction at $\mathfrak p$, then
the exponent of the conductor is equal to $2 + \delta_{\mathfrak p}$, where $\delta_{\mathfrak p}$ is a measure of the
wild ramification in the extensions $K_{\mathfrak p}(E[m])/K_{\mathfrak p}$. Over Q, for example,
the conductor exponents are bounded by

$$f_3 \le 5 \quad\text{and}\quad f_2 \le 8.$$

Even in characteristics 2 and 3, the conductor can easily be computed using
an algorithm of Tate and a formula of Ogg and Saito.


Remark. If E has everywhere semi-stable (i.e., good or multiplicative)
reduction, then its conductor is simply the product of its primes of bad
reduction.
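The case analysis for the conductor exponent can be sketched in code for primes $p \ge 5$ over Q. The sketch below is illustrative only: it assumes the given integral Weierstrass equation is already minimal at p, and it uses the standard criterion (not stated in the text above) that bad reduction is multiplicative exactly when p does not divide $c_4$. The function names are hypothetical, and the primes 2 and 3 are deliberately excluded because they require Tate's algorithm.

```python
def c4_and_discriminant(a1, a2, a3, a4, a6):
    """Return (c4, Delta) for y^2 + a1*x*y + a3*y = x^3 + a2*x^2 + a4*x + a6."""
    b2 = a1 * a1 + 4 * a2
    b4 = 2 * a4 + a1 * a3
    b6 = a3 * a3 + 4 * a6
    b8 = a1 * a1 * a6 + 4 * a2 * a6 - a1 * a3 * a4 + a2 * a3 * a3 - a4 * a4
    c4 = b2 * b2 - 24 * b4
    delta = -b2 * b2 * b8 - 8 * b4 ** 3 - 27 * b6 * b6 + 9 * b2 * b4 * b6
    return c4, delta


def conductor_exponent(c4, delta, p):
    """Exponent f_p for a prime p >= 5, ASSUMING the equation is minimal at p."""
    if p < 5:
        raise ValueError("primes 2 and 3 need Tate's algorithm, not this shortcut")
    if delta % p != 0:
        return 0  # good reduction
    if c4 % p != 0:
        return 1  # multiplicative reduction
    return 2      # additive reduction, p not dividing 6


# Example: y^2 + y = x^3 - x^2, whose discriminant is divisible only by 11.
c4, delta = c4_and_discriminant(0, -1, 1, 0, 0)
print(c4, delta, conductor_exponent(c4, delta, 11))
```

For this example the discriminant is prime to 6, so the curve also has good reduction at 2 and 3 and the shortcut accounts for the whole conductor.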

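For each prime $\mathfrak p$ of K, let $q_{\mathfrak p}$ be the norm of $\mathfrak p$. If E has good reduction
at $\mathfrak p$, we also let

$$a_{\mathfrak p} = q_{\mathfrak p} + 1 - \#\tilde E(k_{\mathfrak p}).$$

The local factor of the L-series of E at $\mathfrak p$ is the polynomial

$$L_{\mathfrak p}(T) = \begin{cases}
1 - a_{\mathfrak p}T + q_{\mathfrak p}T^2 & \text{if $E$ has good reduction at $\mathfrak p$,}\\
1 - T & \text{if $E$ has split multiplicative reduction at $\mathfrak p$,}\\
1 + T & \text{if $E$ has non-split multiplicative reduction at $\mathfrak p$,}\\
1 & \text{if $E$ has additive reduction at $\mathfrak p$.}
\end{cases}$$

In all cases the relation

$$L_{\mathfrak p}(1/q_{\mathfrak p}) = \#\tilde E_{\mathrm{ns}}(k_{\mathfrak p})/q_{\mathfrak p}$$

holds. The global (or Hasse-Weil) L-series of E/K is then defined by the
Euler product

$$L(E/K, s) = \prod_{\mathfrak p} L_{\mathfrak p}(q_{\mathfrak p}^{-s})^{-1}.$$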
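The relation between the local factor and the point count can be checked numerically at a prime of good reduction. The following is a minimal sketch over K = Q, assuming a short Weierstrass equation $y^2 = x^3 + Ax + B$ and an odd prime p not dividing its discriminant; the function names are illustrative, and the point count is a naive loop suitable only for small p.

```python
from fractions import Fraction


def count_points(A, B, p):
    """#E(F_p) for y^2 = x^3 + A*x + B, including the point at infinity."""
    square_roots = [0] * p            # square_roots[r] = number of y with y^2 = r
    for y in range(p):
        square_roots[y * y % p] += 1
    total = 1                         # the point at infinity
    for x in range(p):
        total += square_roots[(x ** 3 + A * x + B) % p]
    return total


def local_factor_at(A, B, p, T):
    """Evaluate L_p(T) = 1 - a_p*T + p*T^2 (good reduction assumed)."""
    a_p = p + 1 - count_points(A, B, p)
    return 1 - a_p * T + p * T * T


# y^2 = x^3 - x has discriminant 64, so every odd prime is a prime of good reduction.
for p in (5, 7, 11, 13):
    n = count_points(-1, 0, p)
    a_p = p + 1 - n
    assert local_factor_at(-1, 0, p, Fraction(1, p)) == Fraction(n, p)
    assert a_p * a_p <= 4 * p         # Hasse estimate |a_p| <= 2*sqrt(p)
    print(p, n, a_p)
```

Exact rational arithmetic (`Fraction`) is used so that the identity $L_p(1/p) = \#\tilde E(\mathbf F_p)/p$ is tested without rounding.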


It is not hard to prove that isogenous curves have the same L-series. The 
following converse is a consequence of (and in fact equivalent to) Faltings’ 
isogeny theorem. 


Theorem. Two elliptic curves E/K and E′/K are isogenous over K if
and only if $a_{\mathfrak p}(E) = a_{\mathfrak p}(E')$ for all (or all but finitely many, or even all but
a set of density zero) primes $\mathfrak p$ of K.


Remark. Over Q it is even true that E/Q and E′/Q are isogenous if and
only if L(E/Q, s) = L(E′/Q, s), but this need not be true over number
fields. An example, given in [7, remark 3.4], is K = Q(i) and

$$E^{\pm} : y^2 = x^3 \pm ix + 3.$$

In this example, $E^+$ and $E^-$ are not isogenous, but $L(E^+, s) = L(E^-, s)$,
since if we write $G_{K/\mathbf Q} = \{1, c\}$, then $a_{\mathfrak p}(E^+) = a_{\mathfrak p^c}(E^-)$.

The estimate $|a_{\mathfrak p}| \le 2\sqrt{q_{\mathfrak p}}$ implies that the Euler product converges and
gives an analytic function in the half-plane Re(s) > 3/2.

Conjecture. The L-series L(E/K, s) has an analytic continuation to the
entire complex plane and satisfies a functional equation relating its values
at s and 2 − s.


Over Q, the conjecture asserts that the function

$$\xi(E/\mathbf Q, s) = N_{E/\mathbf Q}^{s/2}(2\pi)^{-s}\Gamma(s)L(E/\mathbf Q, s)$$

has an analytic continuation and satisfies the functional equation

$$\xi(E/\mathbf Q, 2 - s) = \pm\xi(E/\mathbf Q, s).$$


This is known to be true for modular elliptic curves. 


§15. DUALITY THEORY 


There are both local and global duality theorems for the cohomology of 
an elliptic curve. 


Local Duality Theorem. (Tate) Let K be a complete local field and let
E/K be an elliptic curve. There is a bilinear, non-degenerate pairing

$$E(K) \times H^1(G_{\bar K/K}, E(\bar K)) \to \mathbf Q/\mathbf Z.$$

More precisely, the pairing induces a duality of locally compact groups,
where E(K) is given the topology induced by the topology on K, and where
the cohomology group $H^1(G_{\bar K/K}, E(\bar K))$ is given the discrete topology.

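Here is one of the many equivalent definitions of the Tate pairing. Let
$P \in E(K)$ and $\xi \in H^1(G_{\bar K/K}, E(\bar K))$. Take any integer $m \ge 1$ which
kills $\xi$ and consider the short exact sequence

$$0 \to E(K)/mE(K) \xrightarrow{\ \delta\ } H^1(G_{\bar K/K}, E[m]) \to H^1(G_{\bar K/K}, E(\bar K))[m] \to 0.$$

First we push P forward to get an element $\delta P \in H^1(G_{\bar K/K}, E[m])$. Next we
choose an element $\eta \in H^1(G_{\bar K/K}, E[m])$ which maps to $\xi$. Then the cup
product $\delta P \cup \eta$ is in $H^2(G_{\bar K/K}, E[m] \otimes E[m])$. Finally we use the Weil
pairing $e_m : E[m] \otimes E[m] \to \mu_m$ to get the desired cohomology class

$$e_m(\delta P \cup \eta) \in H^2(G_{\bar K/K}, \mu_m) = \mathrm{Br}(K)[m] \cong \tfrac{1}{m}\mathbf Z/\mathbf Z \subset \mathbf Q/\mathbf Z.$$

Note that the last isomorphism is the identification of the Brauer group
of K with Q/Z provided by local class field theory.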

The global duality theorem is only fully satisfactory when Ш is known
to be finite.

Global Duality Theorem. (Cassels) Let K be a number field and let
E/K be an elliptic curve. There is an alternating bilinear pairing

$$\text{Ш}(E/K) \times \text{Ш}(E/K) \to \mathbf Q/\mathbf Z$$

whose kernel on either side is the group of divisible elements of Ш(E/K).
In particular, if Ш(E/K) is finite, then the pairing is perfect and the order
of Ш(E/K) is a perfect square.

The definition of the pairing on Ш is considerably more complicated, so
we do not give it here.


§16. RATIONAL TORSION AND THE IMAGE OF GALOIS


Let E/K be an elliptic curve defined over a number field. The ℓ-adic
representation

$$\rho_\ell : G_{\bar K/K} \to \operatorname{Aut}(T_\ell(E)) \cong \mathrm{GL}_2(\mathbf Z_\ell)$$

determines many of the arithmetic properties of E. If E has complex
multiplication, $\rho_\ell$ can be described in terms of class field theory. The
following two important results give a further description of $\rho_\ell$.


Theorem. (Serre) Assume that E does not have complex multiplication.
(a) The image of $\rho_\ell$ is of finite index in $\mathrm{GL}_2(\mathbf Z_\ell)$ for all primes ℓ.
(b) The image of $\rho_\ell$ is equal to $\mathrm{GL}_2(\mathbf Z_\ell)$ for all but finitely many primes ℓ.


Theorem. (Faltings) Let E/K and E′/K be elliptic curves. Then the
natural map

$$\operatorname{Hom}_K(E, E') \otimes \mathbf Z_\ell \to \operatorname{Hom}(T_\ell(E), T_\ell(E'))^{G_{\bar K/K}}$$

is an isomorphism.


It is conjectured that the total index of the $\rho_\ell$'s is bounded independently
of the curve E. That is, for a fixed number field K and any non-CM elliptic
curve E/K, the quantity

$$\prod_\ell \left[\mathrm{GL}_2(\mathbf Z_\ell) : \rho_\ell(G_{\bar K/K})\right]$$

is bounded by a number depending only on K. In particular, the torsion
subgroup $E(K)_{\mathrm{tors}}$ should be bounded independently of E. This last
statement has recently been proven.


Theorem. (a) (Mazur) Let E/Q be an elliptic curve. Then $E(\mathbf Q)_{\mathrm{tors}}$ is
one of the following 15 groups:

    Z/nZ with 1 ≤ n ≤ 10 or n = 12, or
    Z/2Z × Z/2nZ with 1 ≤ n ≤ 4.

(b) (Kamienny, Mazur, Merel) Let K be a number field of degree d. Then
there is a constant c(d) so that for every elliptic curve E/K, the torsion
subgroup of E/K satisfies $\#E(K)_{\mathrm{tors}} \le c(d)$.
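As a small bookkeeping check on part (a), the list of groups can be enumerated directly. The sketch below represents each group by the tuple of its cyclic factors, a purely illustrative encoding, and confirms that there are 15 groups and that the largest has order 16.

```python
from math import prod


def mazur_torsion_groups():
    """Groups in Mazur's theorem, encoded as tuples of cyclic-factor orders."""
    cyclic = [(n,) for n in list(range(1, 11)) + [12]]    # Z/nZ
    non_cyclic = [(2, 2 * n) for n in range(1, 5)]        # Z/2Z x Z/2nZ
    return cyclic + non_cyclic


groups = mazur_torsion_groups()
print(len(groups), max(prod(g) for g in groups))
```

In particular one may take c(1) = 16 in part (b).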


§17. TATE CURVES 


Let K be a local field which is complete with respect to a non-archimedean
absolute value $|\cdot|_v$. The analytic parametrization $\mathbf C/L \to E(\mathbf C)$ of
an elliptic curve over C does not have a direct non-archimedean analogue,
because K has no discrete subgroups. However, the situation changes when
one considers $\mathbf C^*/q^{\mathbf Z}$, since any $q \in K^*$ with $|q|_v < 1$ will generate a discrete
subgroup. It turns out that suitably normalized q-expansions of $\wp$, $\wp'$
and $G_{2k}$ give a v-adic analytic isomorphism from $K^*/q^{\mathbf Z}$ to an elliptic
curve $E_q$ defined over K. However, not all elliptic curves over K arise in
this fashion, as can be seen by examining the j-invariant

$$j(E_q) = j(q) = q^{-1} + 744 + 196884q + 21493760q^2 + \cdots.$$

It is clear that $|j(q)|_v > 1$, and so $E_q$ must have multiplicative reduction.
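The integrality of the q-expansion of j, which is what makes $|j(q)|_v = |q|_v^{-1} > 1$ clear, can be reproduced by a short computation. The sketch below assumes the classical identity $j = E_4^3/\Delta$ with $E_4 = 1 + 240\sum \sigma_3(n)q^n$ and $\Delta = q\prod(1 - q^n)^{24}$, which is not stated in the text above; the function names are illustrative and the power-series arithmetic is naive truncated multiplication.

```python
def sigma3(n):
    """Sum of the cubes of the positive divisors of n."""
    return sum(d ** 3 for d in range(1, n + 1) if n % d == 0)


def multiply(a, b, n):
    """Product of two power series, truncated to n coefficients."""
    c = [0] * n
    for i in range(n):
        if a[i]:
            for j in range(n - i):
                c[i + j] += a[i] * b[j]
    return c


def invert(a, n):
    """Inverse of a power series with constant term 1, truncated to n coefficients."""
    inv = [0] * n
    inv[0] = 1
    for k in range(1, n):
        inv[k] = -sum(a[i] * inv[k - i] for i in range(1, k + 1))
    return inv


def j_coefficients(n):
    """Coefficients of q^(-1), q^0, ..., q^(n-2) in the expansion of j."""
    e4 = [1] + [240 * sigma3(k) for k in range(1, n)]
    e4_cubed = multiply(multiply(e4, e4, n), e4, n)
    eta24 = [1] + [0] * (n - 1)       # prod (1 - q^m)^24, i.e. Delta / q
    for m in range(1, n):
        for _ in range(24):
            for k in range(n - 1, m - 1, -1):
                eta24[k] -= eta24[k - m]
    return multiply(e4_cubed, invert(eta24, n), n)


print(j_coefficients(4))
```

All coefficients are integers because the series being inverted has integer coefficients and constant term 1.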


Theorem. (Tate) Let $q \in K^*$ with $|q|_v < 1$. There is an elliptic curve
$E_q/K$ and a $G_{\bar K/K}$-equivariant v-analytic isomorphism

$$\phi : \bar K^*/q^{\mathbf Z} \to E_q(\bar K).$$

The set of curves $\{E_q : q \in K^*,\ |q|_v < 1\}$ is exactly the set of elliptic
curves over K with split multiplicative reduction.


If E/K satisfies $|j(E)|_v > 1$ but does not have split multiplicative
reduction, then it is isomorphic over $\bar K$ to some $E_q$. More precisely, there is
a unique quadratic extension L/K such that E is isomorphic to $E_q$ over L,
and then

$$E(K) \cong \{u \in L^* : N_{L/K}(u) \in q^{\mathbf Z}\}/q^{\mathbf Z}.$$

Further, the extension L/K is unramified if and only if E has non-split
multiplicative reduction.


§18. HEIGHTS AND DESCENT 


Let K be a number field and let $M_K$ be the set of inequivalent absolute
values on K, suitably normalized. The height on $\mathbf P^n$ is the function

$$h : \mathbf P^n(K) \to [0, \infty), \qquad h([x_0, \ldots, x_n]) = \sum_{v \in M_K} \log \max_{0 \le i \le n} |x_i|_v.$$

With the appropriate normalization, the height is independent of the choice
of homogeneous coordinates and of the field K. For this reason, h is often
called the absolute logarithmic height. The height on an elliptic curve E/K
given by a Weierstrass equation is

$$h : E(K) \to [0, \infty), \qquad h(P) = h([x_P, 1]).$$


Proposition. The height on an elliptic curve E/K has the following
properties:

(i) $h(mP) = m^2 h(P) + O(1)$ for all $P \in E(K)$.
(ii) $h(P + Q) + h(P - Q) = 2h(P) + 2h(Q) + O(1)$ for all $P, Q \in E(K)$.
(iii) For any H, the set $\{P \in E(K) : h(P) \le H\}$ is finite.

(The O(1) constants depend on E and, in (i), also on m.)


The canonical (or Néron-Tate) height on E/K is defined by the limit

$$\hat h : E(K) \to [0, \infty), \qquad \hat h(P) = \lim_{n \to \infty} 4^{-n} h(2^n P).$$

Theorem. The canonical height is a positive semi-definite quadratic form
on E(K) with the following properties:

(i) $\hat h(P) = h(P) + O(1)$ for all $P \in E(K)$.

(ii) $\hat h(P) = 0$ if and only if $P \in E_{\mathrm{tors}}$.

Further, $\hat h$ extends R-linearly to give a positive definite quadratic form on
the vector space $E(K) \otimes \mathbf R$.
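The limit defining the canonical height can be watched converging in a concrete case. The following sketch works over Q with the curve $y^2 = x^3 - 2$ and the point P = (3, 5); both are chosen here purely for illustration and do not come from the text. It uses the height $h(P) = h([x_P, 1])$ as defined above and the usual duplication formula for a short Weierstrass equation, in exact rational arithmetic.

```python
from fractions import Fraction
from math import log


def double(P, A):
    """Double an affine point P = (x, y), y != 0, on y^2 = x^3 + A*x + B."""
    x, y = P
    slope = (3 * x * x + A) / (2 * y)
    x2 = slope * slope - 2 * x
    y2 = slope * (x - x2) - y
    return (x2, y2)


def naive_height(P):
    """h(P) = log max(|numerator|, |denominator|) of the x-coordinate."""
    x = P[0]
    return log(max(abs(x.numerator), abs(x.denominator)))


def canonical_height_estimates(P, A, steps):
    """Return the list 4^(-n) * h(2^n P) for n = 0, ..., steps."""
    estimates = []
    for n in range(steps + 1):
        estimates.append(naive_height(P) / 4 ** n)
        if n < steps:
            P = double(P, A)
    return estimates


P = (Fraction(3), Fraction(5))        # a point on y^2 = x^3 - 2 (A = 0)
for n, value in enumerate(canonical_height_estimates(P, 0, 5)):
    print(n, round(value, 6))
```

Since $h(2Q) - 4h(Q)$ is bounded, successive estimates differ by $O(4^{-n})$, which is why a handful of doublings already stabilizes several digits.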


Using the canonical height, it is easy to complete the proof of the 
Mordell-Weil theorem. 


Proof (of the Mordell-Weil theorem). The weak Mordell-Weil theorem says
that E(K)/mE(K) is finite, so let $P_1, \ldots, P_n \in E(K)$ be coset
representatives. Let $H = \max \hat h(P_i)$. I claim that the set

$$S = \{P \in E(K) : \hat h(P) \le H\}$$

is a generating set for E(K). Note this set is finite, since $\hat h = h + O(1)$.
Suppose that it does not generate. Let $Q \in E(K)$ be a point of minimal
canonical height not in the span of S. By assumption, $Q = P_i + mR$ for
some i and some $R \in E(K)$. Then R cannot be in the span of S, so

$$\hat h(Q) \le \hat h(R) = \frac{1}{m^2}\hat h(mR) = \frac{1}{m^2}\hat h(Q - P_i)
\le \frac{2}{m^2}\bigl(\hat h(Q) + \hat h(P_i)\bigr) \le \frac{2}{m^2}\bigl(\hat h(Q) + H\bigr).$$

This implies that

$$\hat h(Q) \le \frac{2H}{m^2 - 2} \le H,$$

which says that $Q \in S$. This contradiction completes the proof.


The bilinear form associated to the canonical height is denoted

$$\langle P, Q\rangle_E = \tfrac{1}{2}\bigl(\hat h(P + Q) - \hat h(P) - \hat h(Q)\bigr).$$

Using this, the elliptic regulator of E/K is defined to be the quantity

$$R(E/K) = \det\bigl(\langle P_i, P_j\rangle_E\bigr)_{1 \le i, j \le r},$$

where $P_1, \ldots, P_r$ is a basis for $E(K)/E(K)_{\mathrm{tors}}$. The elliptic regulator
satisfies R(E/K) > 0.


§19. THE CONJECTURE OF BIRCH AND SWINNERTON-DYER


The conjecture of Birch and Swinnerton-Dyer relates the L-series of an
elliptic curve to many of its other arithmetic invariants. For simplicity, we
will restrict ourselves to K = Q. Let E/Q be an elliptic curve, and let

$$\Omega_\infty = \int_{E(\mathbf R)} |\omega|,$$

where ω is the invariant differential on a minimal Weierstrass equation.
Further, for each prime p, let

$$\Omega_p = \#E(\mathbf Q_p)/E_0(\mathbf Q_p).$$

(Thus if E has good reduction at p, then $\Omega_p = 1$. It is possible to express
the $\Omega_p$'s as the values of p-adic integrals, very much analogous to the
archimedean integral defining $\Omega_\infty$.)


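Conjecture of Birch and Swinnerton-Dyer. Let E/Q be an elliptic
curve.

(a) $\operatorname{ord}_{s=1} L(E/\mathbf Q, s) = \operatorname{rank} E(\mathbf Q)$.
(b) Let $r = \operatorname{rank} E(\mathbf Q)$. Then

$$\lim_{s \to 1} \frac{L(E/\mathbf Q, s)}{(s - 1)^r}
= \Omega_\infty \prod_p \Omega_p \cdot \frac{R(E/\mathbf Q) \cdot \#\text{Ш}(E/\mathbf Q)}{(\#E(\mathbf Q)_{\mathrm{tors}})^2}.$$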


§20. COMPLEX MULTIPLICATION 


Recall that an elliptic curve E is said to have complex multiplication if
its endomorphism ring End(E) is strictly larger than Z. If this happens,
then the algebra $K = \operatorname{End}(E) \otimes \mathbf Q$ is a quadratic imaginary field and
R = End(E) is an order in K. Fix a Weierstrass equation for E of the
form

$$E : y^2 = x^3 + Ax + B \quad\text{with discriminant } \Delta = -16(4A^3 + 27B^2) \ne 0,$$

and define the Weber function on E to be the function

$$\phi_E(P) = \begin{cases}
(AB/\Delta)\,x(P) & \text{if } j(E) \ne 0, 1728,\\
(A^2/\Delta)\,x(P)^2 & \text{if } j(E) = 1728,\\
(B/\Delta)\,x(P)^3 & \text{if } j(E) = 0.
\end{cases}$$

(One can check that $\phi_E$ does not depend on the choice of Weierstrass
equation.)




Theorem. With notation as above, suppose that R is the full ring of
integers of K.

(a) The j-invariant j(E) is an algebraic integer.

(b) The field H = K(j(E)) is the Hilbert class field of K (i.e., H is the
maximal abelian unramified extension of K).

(c) The field $H(\{\phi_E(T) : T \in E_{\mathrm{tors}}\})$ is the maximal abelian
extension $K^{\mathrm{ab}}$ of K.


It is possible to describe the action of $G_{K^{\mathrm{ab}}/K}$ on the numbers $\phi_E(T)$
via the Artin map, although this is most efficiently done using an adelic
formulation. We will be content to describe the action on j(E). For each
prime ideal $\mathfrak p$ of K, let $F_{\mathfrak p} \in G_{H/K}$ be the Frobenius element corresponding
to $\mathfrak p$. Further, choose a lattice $L \subset \mathbf C$ so that there is an analytic isomorphism
$\mathbf C/L \cong E(\mathbf C)$, and define a new elliptic curve $\mathfrak p * E$ to be the elliptic curve
corresponding to the lattice $\mathfrak p^{-1}L$. Then the action of $G_{H/K}$ on j(E) is
determined by the relation

$$j(E)^{F_{\mathfrak p}} = j(\mathfrak p * E).$$


Associated to an elliptic curve E/F with complex multiplication by the
full ring of integers of K is a Grössencharacter

$$\psi_{E/F} : \mathbf A_F^* \to K^*$$

roughly determined by the condition that for each prime $\mathfrak P$ of F, the map

$$[\psi_{E/F}(\mathfrak P)] : E \to E$$

reduces modulo $\mathfrak P$ to the $\mathfrak P$-Frobenius map on the reduced curve $\tilde E$. We
also recall that to any Grössencharacter $\psi : \mathbf A_F^* \to K^*$ is attached the
Hecke L-series

$$L(s, \psi) = \prod_{\mathfrak P} \bigl(1 - \psi(\mathfrak P)\,q_{\mathfrak P}^{-s}\bigr)^{-1},$$

which has an analytic continuation to all of C and satisfies a functional
equation.


Theorem. (Deuring) Let E/F be an elliptic curve with complex
multiplication by the full ring of integers of K.
(a) If $K \subset F$, then

$$L(E/F, s) = L(s, \psi_{E/F})\,L(s, \overline{\psi_{E/F}}).$$

(b) If $K \not\subset F$, let F′ = FK. Then

$$L(E/F, s) = L(s, \psi_{E/F'}).$$


§21. INTEGRAL POINTS


Let K be a number field, let S be a finite set of places of K including
all archimedean places, and let $R_S$ be the ring of S-integers of K. Let

$$E : y^2 + a_1xy + a_3y = x^3 + a_2x^2 + a_4x + a_6$$

be a Weierstrass equation for E/K with integral coefficients

$$a_1, a_2, a_3, a_4, a_6 \in R_S,$$

and consider the set of S-integral points on E,

$$E(R_S) = \{P = (x, y) \in E(K) : x, y \in R_S\}.$$


More generally, we can look at S-integral points relative to an arbitrary
coordinate function on E. A fundamental theorem of Siegel says that such
sets are finite.


Theorem. (Siegel) For any non-constant function $f \in K(E)$, the set of
S-integral points of E relative to f,

$$E_f(R_S) = \{P \in E(K) : f(P) \in R_S\},$$

is a finite set.


Siegel actually proves a more precise statement. To avoid introducing
too much notation, we will only describe it for K = Q and $R_S = \mathbf Z$.


Theorem. (Siegel) Let E/Q be an elliptic curve, let $f \in \mathbf Q(E)$ be a
non-constant function, and for each point $P \in E(\mathbf Q)$, write

$$f(P) = a_P/b_P \quad\text{with } a_P, b_P \in \mathbf Z \text{ and } \gcd(a_P, b_P) = 1.$$

(If $f(P) = \infty$, set $a_P = 1$ and $b_P = 0$.) Then

$$\lim_{\substack{P \in E(\mathbf Q)\\ h(f(P)) \to \infty}} \frac{\log|a_P|}{\log|b_P|} = 1.$$


Siegel’s theorems use methods from the theory of Diophantine approx- 
imation and are not effective. Baker used his results on linear forms in 
logarithms to give effective bounds for the size of integral points on elliptic 
curves. These bounds have been improved over the years, but are still quite 
large. 

Shafarevich used the finiteness of S-integral points on the curve
$y^2 = x^3 + D$ to prove the following finiteness theorem for elliptic curves with
prescribed bad reduction.

Theorem. Fix a finite set S of primes of K. Then there are only finitely
many K-isomorphism classes of elliptic curves E/K which have good
reduction at all primes not in S.


Faltings subsequently proved that the same result is true for curves of
any fixed genus and for abelian varieties of any fixed dimension.


These notes on Eichler-Shimura theory are intended for a reader who is
familiar with elliptic curves and perhaps slightly acquainted with modular 
forms. The primary sources are [8], [19], and [20]. I am deeply indebted to 
Jaap Top for taking my place at the conference on very short notice and 
to Glenn Stevens for making the necessary arrangements with tact and 
understanding. I am also grateful to both of them for a careful reading of 
the text and for several comments which improved the final version. 


1. MODULAR CURVES 


Throughout, the term “curve” will mean “absolutely irreducible variety 
of dimension one.” If k is a field, then k(t) denotes the field of rational 
functions over k. 


1.1. The modular curve $X_0(N)$. Let N be a positive integer. The
modular curve $X_0(N)$ may be defined as follows. First choose an elliptic
curve E over Q(t) such that j(E) = t. Then choose a point of order N on
E and let C be the cyclic group which it generates. The subfield of $\overline{\mathbf Q(t)}$
fixed by the group

$$\{\sigma \in \operatorname{Gal}(\overline{\mathbf Q(t)}/\mathbf Q(t)) : \sigma(C) = C\}$$

is a finite extension K of Q(t), and it turns out that K contains no proper
algebraic extension of Q: in other words, if we think of $\bar{\mathbf Q}$ as the algebraic
closure of Q inside an algebraic closure of K, then $\bar{\mathbf Q} \cap K = \mathbf Q$. It follows
that K is the function field of a smooth projective curve over Q; this is
$X_0(N)$.

The simplest nonvacuous example is the case N = 2. Let us choose E
to be the curve

$$y^2 = 4x^3 - \frac{27t}{t - 1728}\,x - \frac{27t}{t - 1728},$$

so that K is the extension of Q(t) generated by a root of the equation

$$x^3 - \frac{27t}{4(t - 1728)}\,x - \frac{27t}{4(t - 1728)} = 0.$$

Viewed as a cubic in x, the left-hand side is an Eisenstein polynomial at
the place t = 0 of Q(t) with discriminant $2^2 3^{12} t^2/(t - 1728)^3 \notin \mathbf Q(t)^{\times 2}$.
Therefore K is a nonnormal cubic extension of Q(t). We also see that
the place t = 0 is totally ramified in K, while the places t = 1728 and
t = ∞ each split into two places, one ramified of degree 2 and the other
unramified. A calculation using the Hurwitz genus formula then shows that
the genus of $X_0(2)$ is 0. By itself, this says little, because over Q there are
infinitely many mutually nonisomorphic smooth projective curves of genus
0. However, it is easy to see that $X_0(2)$ has a rational point: for example,
observe that at either place of K above t = ∞, the residue class field is Q.
It follows that $X_0(2)$ is isomorphic to $\mathbf P^1$ over Q.
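The three computations behind this example (the invariant of E, the discriminant of the cubic, and the genus) can be spot-checked with exact rational arithmetic. The sketch below assumes the standard formula $j = 1728\,g_2^3/(g_2^3 - 27g_3^2)$ for a curve $y^2 = 4x^3 - g_2x - g_3$ and the Hurwitz formula $2g - 2 = n(2g_0 - 2) + \sum(e - 1)$; the function names are illustrative, and the checks are made at sample values of t rather than symbolically.

```python
from fractions import Fraction


def j_invariant(g2, g3):
    """j-invariant of y^2 = 4x^3 - g2*x - g3."""
    return 1728 * g2 ** 3 / (g2 ** 3 - 27 * g3 ** 2)


def cubic_discriminant(p, q):
    """Discriminant of the cubic x^3 + p*x + q."""
    return -4 * p ** 3 - 27 * q ** 2


def genus_from_hurwitz(degree, base_genus, ramification_indices):
    """Solve 2g - 2 = degree*(2*base_genus - 2) + sum(e - 1) for g."""
    total = degree * (2 * base_genus - 2) + sum(e - 1 for e in ramification_indices)
    if total % 2:
        raise ValueError("inconsistent ramification data")
    return total // 2 + 1


for t in (Fraction(5), Fraction(-7, 3), Fraction(2000)):
    g = 27 * t / (t - 1728)
    assert j_invariant(g, g) == t
    assert cubic_discriminant(-g / 4, -g / 4) == 2**2 * 3**12 * t**2 / (t - 1728) ** 3

# Ramification of the cubic cover: e = 3 over t = 0, and (2, 1) over t = 1728 and t = oo.
print(genus_from_hurwitz(3, 0, [3, 2, 1, 2, 1]))
```

The ramification data passed to the last call are exactly those described in the paragraph above, read off geometrically.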

Returning to the general case, we must still verify that $\bar{\mathbf Q} \cap K = \mathbf Q$ and
that up to isomorphism K is independent of the choice of E and C. The
verification will ultimately lead us to modular functions.

We begin with some notation and conventions. Let k be a field of
characteristic not dividing N. Given a Galois extension k′ of k containing the
group $\mu_N$ of N-th roots of unity, we shall write $\kappa : \operatorname{Gal}(k'/k) \to (\mathbf Z/N\mathbf Z)^\times$
for the character giving the action of Gal(k′/k) on $\mu_N$:

$$\sigma(\zeta) = \zeta^{\kappa(\sigma)} \qquad (\sigma \in \operatorname{Gal}(k'/k),\ \zeta \in \mu_N).$$


Suppose now that E is an elliptic curve over k. Let $E[N] \subset E(\bar k)$ denote
the subgroup of points of order dividing N, and write k(E[N]) for the
finite Galois extension of k generated by the coordinates relative to some
generalized Weierstrass equation for E over k of the affine points on E of
order dividing N. After fixing an ordered basis for E[N] over Z/NZ, we
may identify the natural embedding of Gal(k(E[N])/k) in Aut(E[N]) with
a faithful representation

$$\rho : \operatorname{Gal}(k(E[N])/k) \hookrightarrow \mathrm{GL}(2, \mathbf Z/N\mathbf Z).$$

The formalism of the Weil pairing shows that k(E[N]) contains $\mu_N$ and
that the determinant of ρ is κ. In particular, if k itself contains $\mu_N$, then
κ is trivial and ρ is a representation

$$\operatorname{Gal}(k(E[N])/k) \hookrightarrow \mathrm{SL}(2, \mathbf Z/N\mathbf Z).$$


Theorem 1. If E is an elliptic curve over C(t) with j(E) = t, then the
representation on N-division points is an isomorphism

$$\operatorname{Gal}(\mathbf C(t, E[N])/\mathbf C(t)) \cong \mathrm{SL}(2, \mathbf Z/N\mathbf Z).$$

Theorem 1 will be proved later. For now we derive consequences. The
first consequence is that if E is an elliptic curve over $\mathbf Q(\mu_N)(t)$ with
invariant t, then it is still true that the representation on N-division points is an
isomorphism

$$\operatorname{Gal}(\mathbf Q(t, E[N])/\mathbf Q(\mu_N)(t)) \cong \mathrm{SL}(2, \mathbf Z/N\mathbf Z).$$

For we know that the left-hand side is embedded in the right-hand side, but
on field-theoretic grounds we also have

$$[\mathbf Q(t, E[N]) : \mathbf Q(\mu_N)(t)] \ge [\mathbf C(t, E[N]) : \mathbf C(t)].$$

Hence the conclusion follows from Theorem 1. Next suppose that E is
an elliptic curve over Q(t) with invariant t. Then E can be viewed as
an elliptic curve over $\mathbf Q(\mu_N)(t)$, and consequently the representation on
N-division points

$$\rho : \operatorname{Gal}(\mathbf Q(t, E[N])/\mathbf Q(t)) \hookrightarrow \mathrm{GL}(2, \mathbf Z/N\mathbf Z)$$

sends $\operatorname{Gal}(\mathbf Q(t, E[N])/\mathbf Q(\mu_N)(t))$ onto SL(2, Z/NZ). Since det ρ = κ, the
image of ρ also contains a set of coset representatives for SL(2, Z/NZ) in
GL(2, Z/NZ). Therefore the image of ρ is all of GL(2, Z/NZ), and we
obtain the first part of the following assertion:


Corollary. If E is an elliptic curve over Q(t) with invariant t, then the
representation on N-division points is an isomorphism

$$\operatorname{Gal}(\mathbf Q(t, E[N])/\mathbf Q(t)) \cong \mathrm{GL}(2, \mathbf Z/N\mathbf Z),$$

and $\bar{\mathbf Q} \cap \mathbf Q(t, E[N]) = \mathbf Q(\mu_N)$.


The equation $\bar{\mathbf Q} \cap \mathbf Q(t, E[N]) = \mathbf Q(\mu_N)$ also follows from the argument just
given. Indeed, put $L = \bar{\mathbf Q} \cap \mathbf Q(t, E[N])$, and suppose that L is strictly larger
than $\mathbf Q(\mu_N)$. Then L(t) is strictly larger than $\mathbf Q(\mu_N)(t)$, and consequently

$$[\mathbf Q(t, E[N]) : L(t)] < |\mathrm{SL}(2, \mathbf Z/N\mathbf Z)|.$$

But as before, the left-hand side of this inequality is a priori greater than
or equal to [C(t, E[N]) : C(t)], and we have a contradiction.

We can now clarify some points in the definition of $X_0(N)$. Recall that
the recipe for the function field K of $X_0(N)$ was as follows: Choose an
elliptic curve E over Q(t) with j(E) = t, and then choose a cyclic subgroup
C of E[N]; write G for the Galois group of Q(t, E[N]) over Q(t) and H for
the subgroup of G preserving C; then K is the fixed field of H. Now any
point of order N on E is the second basis vector in some ordered basis for
E[N] over Z/NZ, and consequently we may identify G with GL(2, Z/NZ)
in such a way that H is identified with the lower triangular group

$$\left\{ \begin{pmatrix} a & 0 \\ b & d \end{pmatrix} : a, d \in (\mathbf Z/N\mathbf Z)^\times,\ b \in \mathbf Z/N\mathbf Z \right\}.$$

Since the determinant maps the lower triangular group onto $(\mathbf Z/N\mathbf Z)^\times$,
we have $\mathbf Q(\mu_N) \cap K = \mathbf Q$. Substituting $\mathbf Q(\mu_N) = \bar{\mathbf Q} \cap \mathbf Q(t, E[N])$ in this
equation, we obtain $\bar{\mathbf Q} \cap K = \mathbf Q$, as claimed earlier. We also see that up to
isomorphism over Q(t), the field K is independent of the choice of C: for if
we change the basis of E[N] over Z/NZ, then we conjugate H inside G, and
hence we conjugate (in the field-theoretic sense) K inside Q(t, E[N]). It
remains to examine the dependence of K on E. Quite generally, if E is an
elliptic curve over a field k of characteristic not dividing N, let k(E[N]/±)
denote the extension of k generated by the x-coordinates of the affine points
on E of order dividing N. Then k(E[N]/±) is the fixed field of

$$\{\sigma \in \operatorname{Gal}(k(E[N])/k) : \sigma(P) = \pm P \text{ for every } P \in E[N]\}.$$

Returning to the case at hand, we can say that Q(t, E[N]/±) is the subfield
of Q(t, E[N]) corresponding to the subgroup {±I} of GL(2, Z/NZ); thus

$$\operatorname{Gal}(\mathbf Q(t, E[N]/\pm)/\mathbf Q(t)) \cong \mathrm{GL}(2, \mathbf Z/N\mathbf Z)/\{\pm I\}.$$


Suppose now that E′ is another elliptic curve over Q(t) with j(E′) = t.
Then E and E′ differ by a quadratic twist, and consequently so do the
associated representations

$$\operatorname{Gal}(\overline{\mathbf Q(t)}/\mathbf Q(t)) \to \mathrm{GL}(2, \mathbf Z/N\mathbf Z)$$

provided that bases for E[N] and E′[N] are chosen compatibly. It follows
that the fields Q(t, E[N]/±) and Q(t, E′[N]/±) are equal and that the
associated isomorphisms from

$$\operatorname{Gal}(\mathbf Q(t, E[N]/\pm)/\mathbf Q(t)) \quad\text{and}\quad \operatorname{Gal}(\mathbf Q(t, E'[N]/\pm)/\mathbf Q(t))$$

to GL(2, Z/NZ)/{±I} are identical. Now the lower triangular subgroup
of GL(2, Z/NZ) contains {±I}, so that K is a subfield of Q(t, E[N]/±).
Hence K can be characterized as the subfield of Q(t, E[N]/±) fixed by the
image of the lower triangular subgroup in GL(2, Z/NZ)/{±I}, and this
characterization is independent of the choice of E.
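Since G is identified with GL(2, Z/NZ) and K is the fixed field of the lower triangular subgroup H, the degree [K : Q(t)] is the index [G : H]. The brute-force sketch below counts these groups for small N and compares the index with $\psi(N) = N\prod_{p \mid N}(1 + 1/p)$, a standard closed form that is not stated in the text above; matrices are stored as tuples (top-left, top-right, bottom-left, bottom-right), and all function names are illustrative.

```python
from itertools import product
from math import gcd


def gl2(N):
    """All 2x2 matrices over Z/NZ with unit determinant."""
    return [m for m in product(range(N), repeat=4)
            if gcd((m[0] * m[3] - m[1] * m[2]) % N, N) == 1]


def sl2(N):
    """The subgroup of determinant-one matrices."""
    return [m for m in gl2(N) if (m[0] * m[3] - m[1] * m[2]) % N == 1 % N]


def lower_triangular(N):
    """The subgroup H of invertible matrices with top-right entry zero."""
    return [m for m in gl2(N) if m[1] == 0]


def psi(N):
    """N * prod over primes p dividing N of (1 + 1/p), as an integer."""
    result, n, p = N, N, 2
    while p * p <= n:
        if n % p == 0:
            result = result // p * (p + 1)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        result = result // n * (n + 1)
    return result


for N in (2, 3, 4, 6):
    index = len(gl2(N)) // len(lower_triangular(N))
    print(N, len(gl2(N)), len(sl2(N)), index, psi(N))
```

For N = 2 the index is 3, in agreement with the cubic extension found in the example above.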


1.2. Other modular curves. More generally, let H be any subgroup of
GL(2, Z/NZ) satisfying two conditions:

(i) $-I \in H$.

(ii) The determinant $H \to (\mathbf Z/N\mathbf Z)^\times$ is surjective.

If K is the subfield of Q(t, E[N]) fixed by H, then the same argument
shows that K is the function field of a smooth projective curve X(H) over
Q which up to isomorphism is independent of the choice of E. As we have
just seen, $X_0(N)$ corresponds to the choice

$$H = \left\{ \begin{pmatrix} a & 0 \\ b & d \end{pmatrix} : a, d \in (\mathbf Z/N\mathbf Z)^\times,\ b \in \mathbf Z/N\mathbf Z \right\};$$

another example is the curve $X_1(N)$ corresponding to

$$H = \left\{ \pm\begin{pmatrix} a & 0 \\ b & 1 \end{pmatrix} : a \in (\mathbf Z/N\mathbf Z)^\times,\ b \in \mathbf Z/N\mathbf Z \right\}.$$

These two examples will be the primary focus throughout, and while we
shall initially emphasize $X_0(N)$, it will ultimately be $X_1(N)$ which
provides the broader context for the application to L-functions. We note in
passing that if P is a point of order N on E and C is the subgroup which
it generates, then the function field of $X_1(N)$ can be identified with the
subfield of $\overline{\mathbf Q(t)}$ fixed by $\{\sigma \in \operatorname{Gal}(\overline{\mathbf Q(t)}/\mathbf Q(t)) : \sigma(P) = \pm P\}$, just as
the function field of $X_0(N)$ coincides with the subfield of $\overline{\mathbf Q(t)}$ fixed by
$\{\sigma \in \operatorname{Gal}(\overline{\mathbf Q(t)}/\mathbf Q(t)) : \sigma(C) = C\}$.
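Conditions (i) and (ii) for the subgroup defining $X_1(N)$ can be verified mechanically for a given N. The sketch below builds the set of matrices $\pm\begin{pmatrix} a & 0 \\ b & 1\end{pmatrix}$ modulo N, again as tuples (top-left, top-right, bottom-left, bottom-right), and checks closure under multiplication, membership of −I, and surjectivity of the determinant. The helper names are illustrative.

```python
from math import gcd


def units(N):
    """Representatives of (Z/NZ)^x."""
    return [a for a in range(N) if gcd(a, N) == 1]


def h1(N):
    """The matrices +/-(a 0; b 1) over Z/NZ, as a set of 4-tuples."""
    group = set()
    for a in units(N):
        for b in range(N):
            group.add((a, 0, b, 1))
            group.add((-a % N, 0, -b % N, -1 % N))
    return group


def mat_mul(x, y, N):
    """Product of two 2x2 matrices over Z/NZ."""
    return ((x[0] * y[0] + x[1] * y[2]) % N, (x[0] * y[1] + x[1] * y[3]) % N,
            (x[2] * y[0] + x[3] * y[2]) % N, (x[2] * y[1] + x[3] * y[3]) % N)


N = 5
H = h1(N)
closed = all(mat_mul(x, y, N) in H for x in H for y in H)
contains_minus_identity = (N - 1, 0, 0, N - 1) in H
determinants = {(m[0] * m[3] - m[1] * m[2]) % N for m in H}
print(len(H), closed, contains_minus_identity, determinants == set(units(N)))
```

A finite subset of a group that is closed under multiplication is a subgroup, so the closure check suffices here.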

By construction, a modular curve X(H) with function field K comes
equipped with a distinguished morphism to $\mathbf P^1$ over Q, namely the
morphism corresponding to the inclusion of Q(t) in K. It is conventional to
refer to the finite set of points on X(H) lying over the point t = ∞ of $\mathbf P^1$ as
cusps. If we remove the cusps from X(H), then we obtain an affine curve
Y(H), which is usually denoted $Y_0(N)$ in the case of $X_0(N)$ and $Y_1(N)$
in the case of $X_1(N)$. To illustrate the notion of a cusp, let us prove that
$X_0(N)$ has at least one cusp rational over Q. Choose an elliptic curve E
over Q(t) with invariant t and split multiplicative reduction at the place
t = ∞ of Q(t). For example, E could be the curve

$$y^2 + xy = x^3 - \frac{36}{t - 1728}\,x - \frac{1}{t - 1728},$$

because the covariants $c_4$ and $c_6$ of this equation satisfy $-c_4/c_6 = 1$. Denote
the place t = ∞ of Q(t) simply by ∞, identify the completion of Q(t) at ∞
with Q((1/t)), and pick an extension of ∞ to a place of $\overline{\mathbf Q(t)}$, so that the
corresponding decomposition subgroup $\operatorname{Gal}(\overline{\mathbf Q(t)}/\mathbf Q(t))_\infty$ of $\operatorname{Gal}(\overline{\mathbf Q(t)}/\mathbf Q(t))$ is
identified with the Galois group of $\overline{\mathbf Q((1/t))}$ over Q((1/t)). By the theory
of Tate curves, there is an isomorphism of $\operatorname{Gal}(\overline{\mathbf Q(t)}/\mathbf Q(t))_\infty$-modules

$$E\bigl(\overline{\mathbf Q((1/t))}\bigr) \cong \overline{\mathbf Q((1/t))}^{\times}/q^{\mathbf Z},$$

where $q \in \mathbf Q((1/t))$ is a uniformizer and $q^{\mathbf Z}$ denotes the infinite cyclic group
generated by q. It follows in particular that

$$E[N] \cong \bigl(\mu_N \cdot (q^{1/N})^{\mathbf Z}\bigr)/q^{\mathbf Z}$$

as $\operatorname{Gal}(\overline{\mathbf Q(t)}/\mathbf Q(t))_\infty$-modules. Let $\{P_1, P_2\}$ be the basis for E[N] which
corresponds under the preceding isomorphism to the basis $\{q^{1/N} \bmod q^{\mathbf Z},\ \zeta\}$,
where ζ is a generator of $\mu_N$. Relative to this basis, the action of the group
$\operatorname{Gal}(\overline{\mathbf Q(t)}/\mathbf Q(t))_\infty$ on E[N] is represented by matrices of the form

$$\begin{pmatrix} 1 & 0 \\ b(\sigma) & \kappa(\sigma) \end{pmatrix} \qquad (\sigma \in \operatorname{Gal}(\overline{\mathbf Q(t)}/\mathbf Q(t))_\infty).$$

In particular, the cyclic group C generated by $P_2$ is preserved by the
decomposition group $\operatorname{Gal}(\overline{\mathbf Q(t)}/\mathbf Q(t))_\infty$, whence the latter is contained in the
subgroup of $\operatorname{Gal}(\overline{\mathbf Q(t)}/\mathbf Q(t))$ fixing the function field of $X_0(N)$. Thus the
residue class degree of the restriction of ∞ to this function field is 1, and
so the field of rationality of the corresponding cusp is Q.


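1.3. Moduli interpretation of $X_0(N)$ and $X_1(N)$. Let k be an
algebraically closed field, and consider pairs $(\mathcal E, \mathcal C)$ consisting of an elliptic
curve $\mathcal E$ over k and a cyclic subgroup $\mathcal C \subset \mathcal E[N]$ of order N. An
isomorphism from a pair $(\mathcal E_1, \mathcal C_1)$ to a pair $(\mathcal E_2, \mathcal C_2)$ is an isomorphism from $\mathcal E_1$ to
$\mathcal E_2$ sending $\mathcal C_1$ to $\mathcal C_2$. We write $[\mathcal E, \mathcal C]$ for the isomorphism class containing
$(\mathcal E, \mathcal C)$ and $\mathrm{Ell}_0(N)(k)$ for the set of all isomorphism classes. Also, if S is
any subset of $\mathbf P^1(k)$, then $\mathrm{Ell}_0(N)(k)_S$ denotes the set of all isomorphism
classes $[\mathcal E, \mathcal C] \in \mathrm{Ell}_0(N)(k)$ such that $j(\mathcal E) \notin S$.

One can also consider pairs of the form $(\mathcal E, \mathcal P)$, where $\mathcal P$ is a point of
order N on $\mathcal E$. An isomorphism from $(\mathcal E_1, \mathcal P_1)$ to $(\mathcal E_2, \mathcal P_2)$ is an isomorphism
from $\mathcal E_1$ to $\mathcal E_2$ sending $\mathcal P_1$ to $\mathcal P_2$. Note the equality of isomorphism classes
$[\mathcal E, \mathcal P] = [\mathcal E, -\mathcal P]$. We write $\mathrm{Ell}_1(N)(k)$ for the set of isomorphism classes
and $\mathrm{Ell}_1(N)(k)_S$ for the subset consisting of those $[\mathcal E, \mathcal P]$ such that $j(\mathcal E) \notin S$.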

Next observe that if X is a modular curve and E is an elliptic curve over
Q(t) with invariant t, then E can be viewed as an elliptic curve over the
function field of X, because the latter field is naturally an extension of Q(t).
In particular, since a point $x \in X(\mathbf C)$ determines a discrete valuation ring
$\mathcal O_x$ of the complex function field of X, one can ask whether E has good
reduction at the maximal ideal $\mathfrak m_x$ of $\mathcal O_x$: if so, then reduction modulo
$\mathfrak m_x$ yields an elliptic curve $E_x$ over C. After extending $\mathcal O_x$ to a discrete
valuation ring of C(t, E[N]) in some way, we can also consider the reduction
map on N-division points $E[N] \to E_x[N]$, which is injective. Thus if P is
a point of order N on E and C is the cyclic subgroup which it generates,
then the reductions $P_x$ and $C_x$ are defined, and are still of order N.

Given a subset S of $\mathbf P^1(\mathbf C)$, let $\mathbf P^1(\mathbf C)_S$ denote the complement of S in
$\mathbf P^1(\mathbf C)$ and $X(\mathbf C)_S$ the inverse image of $\mathbf P^1(\mathbf C)_S$ in X(C). For example, if
S = {∞}, then $X_0(N)(\mathbf C)_S = Y_0(N)(\mathbf C)$.


Proposition 1. Let E be an elliptic curve over Q(t) with j(E) = t, and let
S be a subset of $\mathbf P^1(\mathbf C)$ containing all places where E has bad reduction. Fix
an ordered basis for E[N] over Z/NZ, let P be the second element of this
basis, and let C be the cyclic group of order N generated by P. Then the
map $x \mapsto [E_x, C_x]$ defines a bijection of $X_0(N)(\mathbf C)_S$ onto $\mathrm{Ell}_0(N)(\mathbf C)_S$. The
same is true if $X_0(N)$, $\mathrm{Ell}_0(N)$, and C are replaced by $X_1(N)$, $\mathrm{Ell}_1(N)$,
and P respectively.


Proof. First of all, S necessarily contains 0, 1728, and ∞. Indeed E differs
from the curve

$$y^2 + xy = x^3 - \frac{36}{t - 1728}\,x - \frac{1}{t - 1728}$$

by a quadratic twist over Q(t), and therefore the discriminant of any
equation for E differs from the discriminant of the above equation by a sixth
power in $\mathbf Q(t)^\times$. Since the above equation has discriminant $t^2/(t - 1728)^3$,
it follows that E has bad reduction at 0, 1728, and ∞.

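As usual, we identify Gal(Q(t, E[N])/Q(t)) with GL(2, Z/NZ), and
similarly Gal(C(t, E[N])/C(t)) with SL(2, Z/NZ). Let $K \subset \mathbf Q(t, E[N])$ be the
subfield corresponding to

$$H = \left\{ \begin{pmatrix} a & 0 \\ b & d \end{pmatrix} : a, d \in (\mathbf Z/N\mathbf Z)^\times,\ b \in \mathbf Z/N\mathbf Z \right\},$$

so that CK is the function field of $X_0(N)$ over C. Also let X be a curve with
complex function field C(t, E[N]) and $\pi : X \to X_0(N)$ the corresponding
morphism. Given a point $t_0 \in \mathbf P^1(\mathbf C)_S$, a point $x_0 \in X_0(N)(\mathbf C)$ lying over
$t_0$, and a point $\tilde x_0 \in X(\mathbf C)$ lying over $x_0$, we have a bijection

$$\begin{aligned}
\mathrm{SL}(2, \mathbf Z/N\mathbf Z)/(H \cap \mathrm{SL}(2, \mathbf Z/N\mathbf Z)) &\to \{\text{fiber of } X_0(N)(\mathbf C) \text{ over } t_0\}\\
g(H \cap \mathrm{SL}(2, \mathbf Z/N\mathbf Z)) &\mapsto \pi(g\tilde x_0),
\end{aligned} \tag{1}$$

because the extension C(t, E[N])/C(t) is unramified outside the places of
C(t) where E has bad reduction, and hence in particular outside S. On
the other hand, the maps

$$\begin{aligned}
\mathrm{SL}(2, \mathbf Z/N\mathbf Z)/(H \cap \mathrm{SL}(2, \mathbf Z/N\mathbf Z)) &\to \mathrm{GL}(2, \mathbf Z/N\mathbf Z)/H\\
g(H \cap \mathrm{SL}(2, \mathbf Z/N\mathbf Z)) &\mapsto gH
\end{aligned} \tag{2}$$

and

$$\begin{aligned}
\mathrm{GL}(2, \mathbf Z/N\mathbf Z)/H &\to \{\text{cyclic subgroups of } E \text{ of order } N\}\\
gH &\mapsto gC
\end{aligned} \tag{3}$$

are also bijections, as is the map

$$\{\text{cyclic subgroups of } E \text{ of order } N\} \to \{\text{cyclic subgroups of } E_{\tilde x_0} \text{ of order } N\} \tag{4}$$

afforded by reduction modulo $\mathfrak m_{\tilde x_0}$. Now if $x = \pi(g\tilde x_0)$, then $E_{\tilde x_0} = E_x$,
and $(gC)_{\tilde x_0} = C_x$ (note that these identifications are meaningful since the
residue class field C is a subring of $\mathcal O_{\tilde x_0}$ and $\mathcal O_x$). Therefore composing the
inverse of (1) with (2), (3), and (4), we see that the map

$$\{\text{fiber of } X_0(N)(\mathbf C) \text{ over } t_0\} \to \{\text{cyclic subgroups of } E_{\tilde x_0} \text{ of order } N\}$$

is a bijection. But $j(E_{\tilde x_0}) = t_0 \ne 0, 1728$, whence $\operatorname{Aut}(E_{\tilde x_0}) = \{\pm 1\}$.
Thus an automorphism of $E_{\tilde x_0}$ sends each cyclic subgroup of $E_{\tilde x_0}$ to itself.
Consequently

$$\begin{aligned}
\{\text{cyclic subgroups of } E_{\tilde x_0} \text{ of order } N\} &\to \{[\mathcal E, \mathcal C] \in \mathrm{Ell}_0(N)(\mathbf C)_S : j(\mathcal E) = t_0\}\\
\mathcal C &\mapsto [E_{\tilde x_0}, \mathcal C]
\end{aligned}$$

is also a bijection, and composing this map with the previous one we see
that $x \mapsto [E_x, C_x]$ is a bijection from the fiber of $X_0(N)(\mathbf C)$ over $t_0 \in \mathbf P^1(\mathbf C)_S$
to the subset of $\mathrm{Ell}_0(N)(\mathbf C)_S$ consisting of isomorphism classes
with j-invariant $t_0$.

The argument for $X_1(N)$ is similar. The only additional wrinkle is that
P is not actually defined over the function field of $X_1(N)$ over C, but only over
a quadratic extension. Thus $P_x$ depends on a choice of a valuation ring
lying over $\mathcal O_x$. However, if we make the alternate choice then we simply
replace $P_x$ by $-P_x$. Since $[E_x, P_x] = [E_x, -P_x]$, the map $x \mapsto [E_x, P_x]$ is
still well defined.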


1.4. Proof of Theorem 1: a preliminary reduction. We claim that 
to prove Theorem 1 it suffices to prove the following: 


Proposition 2. There exists an elliptic curve E over C(t) with invariant 
t such that 


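\[ [\mathbf{C}(t, E[N]/\pm) : \mathbf{C}(t)] = |\mathrm{SL}(2,\mathbf{Z}/N\mathbf{Z})/\{\pm I\}|. \]

Granting Proposition 2, let us see how to deduce Theorem 1. Suppose that
$E'$ is any elliptic curve over $\mathbf{Q}(t)$ with invariant $t$. Then $E'$ differs from
$E$ by at most a quadratic twist, so that $\mathbf{C}(t, E[N]/\pm) = \mathbf{C}(t, E'[N]/\pm)$.
Consequently

\[ [\mathbf{C}(t, E'[N]/\pm) : \mathbf{C}(t)] = [\mathbf{C}(t, E[N]/\pm) : \mathbf{C}(t)] = |\mathrm{SL}(2,\mathbf{Z}/N\mathbf{Z})/\{\pm I\}|, \]

and the embedding $\mathrm{Gal}(\mathbf{C}(t, E'[N]/\pm)/\mathbf{C}(t)) \to \mathrm{SL}(2,\mathbf{Z}/N\mathbf{Z})/\{\pm I\}$ is an
isomorphism. Hence the image of the representation

\[ \mathrm{Gal}(\mathbf{C}(t, E'[N])/\mathbf{C}(t)) \longrightarrow \mathrm{SL}(2,\mathbf{Z}/N\mathbf{Z}) \]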




contains a set of coset representatives for $\{\pm I\}$ in $\mathrm{SL}(2,\mathbf{Z}/N\mathbf{Z})$ and so in
particular contains either the matrix

\[ \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \]

or its negative. Since the square of this matrix is $-I$, we find that the
image is all of $\mathrm{SL}(2,\mathbf{Z}/N\mathbf{Z})$, and Theorem 1 follows.
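
The final step of this deduction can be checked by machine for small N. The following Python sketch is an illustration added here, not part of the argument; the helper names `sl2_mod`, `mat_mul`, `order_formula`, and `neg_identity_from_w` are hypothetical, and the closed formula for the order of SL(2, Z/NZ) is the standard one, assumed rather than proved above. It enumerates SL(2, Z/NZ) by brute force and confirms that the displayed matrix squares to -I.

```python
from itertools import product


def sl2_mod(n):
    """All 2x2 matrices (a, b, c, d) over Z/nZ with determinant 1."""
    return [(a, b, c, d)
            for a, b, c, d in product(range(n), repeat=4)
            if (a * d - b * c) % n == 1 % n]


def mat_mul(x, y, n):
    """Product of two 2x2 matrices, each stored as (a, b, c, d), modulo n."""
    a, b, c, d = x
    e, f, g, h = y
    return ((a * e + b * g) % n, (a * f + b * h) % n,
            (c * e + d * g) % n, (c * f + d * h) % n)


def order_formula(n):
    """n^3 * prod over primes p | n of (1 - p^-2), in integer arithmetic."""
    result, m, p = n ** 3, n, 2
    while m > 1:
        if m % p == 0:
            result = result // (p * p) * (p * p - 1)
            while m % p == 0:
                m //= p
        p += 1
    return result


def neg_identity_from_w(n):
    """Square the matrix w = [[0, -1], [1, 0]] modulo n."""
    w = (0, (-1) % n, 1 % n, 0)
    return mat_mul(w, w, n)


for n in range(2, 7):
    assert len(sl2_mod(n)) == order_formula(n)
    assert neg_identity_from_w(n) == ((-1) % n, 0, 0, (-1) % n)
```

Since w squared is -I, any subgroup containing w or -w also contains -I, which is exactly what the proof uses to pass from coset representatives to the full group.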


1.5. Modular functions. It remains to prove Proposition 2. The argu- 
ment will use the theory of modular functions. We begin by recalling the 
relevant definitions. 

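Let $\mathfrak{H}$ denote the complex upper half-plane, and let $\mathrm{GL}^+(2,\mathbf{R})$ denote
the subgroup of $\mathrm{GL}(2,\mathbf{R})$ consisting of matrices with positive determinant.
We consider the usual action of $\mathrm{GL}^+(2,\mathbf{R})$ on $\mathfrak{H}$ by fractional linear
transformations: for

\[ \gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{GL}^+(2,\mathbf{R}) \]

and $z \in \mathfrak{H}$ put

\[ \gamma z = \frac{az+b}{cz+d}. \]

If $f$ is a function on $\mathfrak{H}$ we write $f \circ \gamma$ for the function $z \mapsto f(\gamma z)$. Now
suppose that $\Gamma$ is a subgroup of finite index in $\mathrm{SL}(2,\mathbf{Z})$. A modular function
for $\Gamma$ is a meromorphic function $f$ on $\mathfrak{H}$ satisfying two conditions:

(i) $f \circ \gamma = f$ for $\gamma \in \Gamma$.
(ii) Given $\delta \in \mathrm{SL}(2,\mathbf{Z})$, let $M$ be a positive integer such that

\[ (f \circ \delta)(z + M) = (f \circ \delta)(z). \]

(Such an integer exists by (i), because $\Gamma$ has finite index in $\mathrm{SL}(2,\mathbf{Z})$
and consequently some power of the matrix $\delta \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \delta^{-1}$ belongs to
$\Gamma$.) Let $F$ be the meromorphic function on the punctured unit disk
$D^0 = \{q \in \mathbf{C} : 0 < |q| < 1\}$ defined by

\[ (f \circ \delta)(z) = F(e^{2\pi i z/M}) \qquad (z \in \mathfrak{H}). \]

Then $F$ extends to a meromorphic function on the full unit disk
$D = \{q \in \mathbf{C} : |q| < 1\}$.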


Like any meromorphic function on the punctured disk, the function $F$ in
(ii) is represented by a Laurent series

\[ F(q) = \sum_{n \in \mathbf{Z}} a(n) q^n \]

for $q$ near 0, and the content of (ii) is that $F$ does not have an essential
singularity at $q = 0$. Thus (ii) is equivalent to the condition that for every
$\delta \in \mathrm{SL}(2,\mathbf{Z})$, the function $f \circ \delta$ has a Fourier series expansion of the form

\[ f(\delta z) = \sum_{n \geq n_0} a(n) e^{2\pi i n z/M} \qquad (\mathrm{Im}(z) \gg 0) \]

with $n_0 \in \mathbf{Z}$. Of course if $\Gamma = \mathrm{SL}(2,\mathbf{Z})$ then $f \circ \delta = f$ by (i), and to verify
(ii) it suffices to check that $f$ itself has such a Fourier expansion.
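
The integer M in condition (ii) can be found by direct search for the familiar congruence subgroups. The sketch below is an added illustration in Python; the function names are hypothetical, and matrices are stored as tuples (a, b, c, d) for the matrix with rows (a, b) and (c, d). It looks for the least M such that the conjugate of the translation z -> z + M by delta lies in the subgroup, using the level-N groups of matrices with c divisible by N (written $\Gamma_0(N)$ later in the text) and $\Gamma(N)$ as test cases.

```python
def conjugate_translation(delta, m):
    """Return delta * [[1, m], [0, 1]] * delta^{-1} for delta in SL(2, Z)."""
    a, b, c, d = delta
    # delta * T^m has rows (a, a*m + b) and (c, c*m + d)
    p, q, r, s = a, a * m + b, c, c * m + d
    # multiply on the right by delta^{-1} = [[d, -b], [-c, a]]
    return (p * d - q * c, -p * b + q * a, r * d - s * c, -r * b + s * a)


def in_gamma0(g, n):
    """Membership in the group of matrices with lower-left entry divisible by n."""
    return g[2] % n == 0


def in_gamma(g, n):
    """Membership in the kernel of reduction modulo n."""
    a, b, c, d = g
    return a % n == 1 % n and d % n == 1 % n and b % n == 0 and c % n == 0


def smallest_period(delta, member, limit=10_000):
    """Least m >= 1 with delta * T^m * delta^{-1} in the subgroup described by `member`."""
    for m in range(1, limit + 1):
        if member(conjugate_translation(delta, m)):
            return m
    raise ValueError("no period found below the search limit")
```

For instance, with delta the matrix with rows (0, -1) and (1, 0) and the level-6 group of the first kind, the search returns 6, while delta = I gives 1.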

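The usual operations of addition and multiplication of meromorphic
functions make the set of modular functions for $\Gamma$ into a field $\mathfrak{M}(\Gamma)$. For
the proof of Proposition 2 we are interested in the cases where $\Gamma$ is the
group $\mathrm{SL}(2,\mathbf{Z})$ (also denoted $\Gamma(1)$) or the group $\Gamma(N)$, the kernel of the
reduction-modulo-$N$ map $\mathrm{SL}(2,\mathbf{Z}) \to \mathrm{SL}(2,\mathbf{Z}/N\mathbf{Z})$. The quotient group

\[ \Gamma(1)/\{\pm I\}\Gamma(N) \cong \mathrm{SL}(2,\mathbf{Z}/N\mathbf{Z})/\{\pm I\} \]

acts as a group of automorphisms of $\mathfrak{M}(\Gamma(N))$ with fixed field $\mathfrak{M}(\Gamma(1))$,
and therefore $\mathfrak{M}(\Gamma(N))$ is a Galois extension of $\mathfrak{M}(\Gamma(1))$ with Galois group
isomorphic to $\mathrm{SL}(2,\mathbf{Z}/N\mathbf{Z})/\{\pm I\}$. For the record, we make an explicit
identification

\[ \theta : \mathrm{SL}(2,\mathbf{Z}/N\mathbf{Z})/\{\pm I\} \longrightarrow \mathrm{Gal}(\mathfrak{M}(\Gamma(N))/\mathfrak{M}(\Gamma(1))) \]

by declaring that if $\gamma \in \mathrm{SL}(2,\mathbf{Z})$ and $f \in \mathfrak{M}(\Gamma(N))$ then

\[ \theta([\gamma])(f) = f \circ \gamma^t, \]

where $[\gamma]$ denotes the image of $\gamma$ in $\mathrm{SL}(2,\mathbf{Z}/N\mathbf{Z})/\{\pm I\}$ and $\gamma^t$ denotes the
transpose of $\gamma$.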

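An essential point for the proof of Proposition 2 is that $\mathfrak{M}(\Gamma(1))$ is
generated over $\mathbf{C}$ by a single nonconstant function, the $j$-function. Like
the functions $g_2$ and $g_3$ appearing in the formula

\[ j = 1728\, \frac{g_2^3}{g_2^3 - 27 g_3^2} \]

which defines it, the $j$-function should be viewed in the first instance as
a function of a lattice variable $\mathcal{L} \subset \mathbf{C}$. We obtain functions of $z \in \mathfrak{H}$ by
putting $g_2(z) = g_2(\mathcal{L}_z)$, $g_3(z) = g_3(\mathcal{L}_z)$, and $j(z) = j(\mathcal{L}_z)$, where

\[ \mathcal{L}_z = \mathbf{Z}z \oplus \mathbf{Z}. \]

Now as functions of lattices, $g_2$ and $g_3$ are defined by the formulas

\[ g_2(\mathcal{L}) = 60 \sum_{\substack{\omega \in \mathcal{L} \\ \omega \neq 0}} \omega^{-4} \quad \text{and} \quad g_3(\mathcal{L}) = 140 \sum_{\substack{\omega \in \mathcal{L} \\ \omega \neq 0}} \omega^{-6}, \]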
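which give at once the behavior of $g_2$ and $g_3$ under homothety:

\[ g_2(\lambda\mathcal{L}) = \lambda^{-4} g_2(\mathcal{L}) \quad \text{and} \quad g_3(\lambda\mathcal{L}) = \lambda^{-6} g_3(\mathcal{L}) \quad \text{for all } \lambda \in \mathbf{C}^\times. \]

It follows that as a function of lattices, $j$ is invariant under homothety.
Furthermore, if $z \in \mathfrak{H}$ and $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}(2,\mathbf{Z})$, then

\[ \mathcal{L}_z = \mathbf{Z}z \oplus \mathbf{Z} = \mathbf{Z}(az+b) \oplus \mathbf{Z}(cz+d) = \lambda \mathcal{L}_{\gamma z}, \]

with $\lambda = cz+d$, whence $j(\gamma z) = j(z)$. This is condition (i) in the definition
of modular functions; to verify (ii) we write $z = x + iy$ and observe that

\[ \lim_{y \to \infty} g_2(x+iy) = 120 \sum_{n \geq 1} n^{-4} \quad \text{and} \quad \lim_{y \to \infty} g_3(x+iy) = 280 \sum_{n \geq 1} n^{-6} \]

uniformly in $x$. Thus the holomorphic functions $G_2$ and $G_3$ on $D^0$ such
that $g_2(z) = G_2(e^{2\pi i z})$ and $g_3(z) = G_3(e^{2\pi i z})$ extend holomorphically to
$D$, and consequently the function $J = 1728\, G_2^3/(G_2^3 - 27G_3^2)$ extends at least
meromorphically to $D$. Hence $j$ is a modular function for $\mathrm{SL}(2,\mathbf{Z})$. Now
the calculation

\[ \Bigl(120 \sum_{n \geq 1} n^{-4}\Bigr)^3 - 27 \Bigl(280 \sum_{n \geq 1} n^{-6}\Bigr)^2 = (120\pi^4/90)^3 - 27(280\pi^6/945)^2 = 0 \]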


shows that $J(q)$ actually has a pole at $q = 0$, and a more thorough analysis
reveals that the pole is simple with residue 1. Therefore $j$ has an expansion
of the form

\[ j(z) = \frac{1}{q} + \text{power series in } q \]

for $q = e^{2\pi i z}$ near 0. In fact the Fourier expansion of $j$ holds for all $q$ in
the unit disk, i.e., for all $z \in \mathfrak{H}$, because $j$ is holomorphic on $\mathfrak{H}$: indeed the
properties of the Weierstrass $\wp$-function show that $g_2^3 - 27g_3^2$ is nowhere
vanishing as a function of lattices and hence also as a function of $z \in \mathfrak{H}$.
From the fact that $j$ is holomorphic on $\mathfrak{H}$ with only a simple pole as a
Laurent series in $q$, one deduces that for any $f \in \mathfrak{M}(\Gamma(1))$ there exist
polynomials $P(t), Q(t) \in \mathbf{C}[t]$ with $P(t) \neq 0$ such that $P(j)f - Q(j)$ is
holomorphic on $\mathfrak{H}$ and

\[ \lim_{y \to \infty} \bigl( P(j(z)) f(z) - Q(j(z)) \bigr) = 0 \]

uniformly in $x$. An application of the maximum principle on a suitably
truncated fundamental domain for $\mathrm{SL}(2,\mathbf{Z})\backslash\mathfrak{H}$ then gives $P(j)f = Q(j)$,
whence $\mathfrak{M}(\Gamma(1)) = \mathbf{C}(j)$ as claimed.
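
Two of the computations in this subsection lend themselves to a quick machine check. The Python sketch below is an added illustration with hypothetical function names. It verifies in exact rational arithmetic that the coefficient of $\pi^{12}$ in the displayed calculation vanishes, and it approximates $j$ by truncating the lattice sums for $g_2$ and $g_3$ to a square box of indices. The truncation bound 60 is an arbitrary choice, and the values $j(i) = 1728$ and $j(e^{2\pi i/3}) = 0$ used for comparison are standard facts assumed here, not derived above.

```python
from fractions import Fraction
import cmath


def lattice_invariants(z, bound=60):
    """Truncated sums g2 = 60*sum w^-4 and g3 = 140*sum w^-6 over w = m*z + n."""
    s4 = s6 = 0j
    for m in range(-bound, bound + 1):
        for n in range(-bound, bound + 1):
            if m == 0 and n == 0:
                continue
            w = m * z + n
            s4 += w ** -4
            s6 += w ** -6
    return 60 * s4, 140 * s6


def j_invariant(z, bound=60):
    """Approximate j(z) = 1728 g2^3 / (g2^3 - 27 g3^2) from the truncated sums."""
    g2, g3 = lattice_invariants(z, bound)
    return 1728 * g2 ** 3 / (g2 ** 3 - 27 * g3 ** 2)


# Limits at infinity as rational multiples of pi^4 and pi^6:
# 120*zeta(4) = (120/90) pi^4 and 280*zeta(6) = (280/945) pi^6.
c2 = Fraction(120, 90)
c3 = Fraction(280, 945)
leading = c2 ** 3 - 27 * c3 ** 2  # coefficient of pi^12 in the displayed calculation
```

The exact value of `leading` is 0, which is the cancellation responsible for the pole of $J$ at $q = 0$.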
1.6. Elliptic functions. To summarize, $\mathfrak{M}(\Gamma(1)) = \mathbf{C}(j)$, and $\mathfrak{M}(\Gamma(N))$
is a Galois extension of $\mathbf{C}(j)$ with Galois group

\[ \mathrm{Gal}(\mathfrak{M}(\Gamma(N))/\mathfrak{M}(\Gamma(1))) \cong \mathrm{SL}(2,\mathbf{Z}/N\mathbf{Z})/\{\pm I\}. \]

Consider the elliptic curve

\[ E : y^2 = 4x^3 - \frac{27j}{j-1728}\, x - \frac{27j}{j-1728} \]

over $\mathbf{C}(j)$. We will show that $\mathbf{C}(j, E[N]/\pm)$ coincides with $\mathfrak{M}(\Gamma(N))$ when
both fields are viewed inside a fixed algebraic closure of $\mathbf{C}(j)$. This will
prove Proposition 2.

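The additional ingredient needed at this point is the Weierstrass
parametrization of elliptic curves over $\mathbf{C}$. Let $\mathcal{L}$ be a lattice in $\mathbf{C}$ and consider
the elliptic curve

\[ \mathcal{E}^{\mathrm{Wst}} : Y^2 = 4X^3 - g_2(\mathcal{L})X - g_3(\mathcal{L}). \]

We recall that the Weierstrass $\wp$-function

\[ \wp(u;\mathcal{L}) = \frac{1}{u^2} + \sum_{\substack{\omega \in \mathcal{L} \\ \omega \neq 0}} \Bigl( \frac{1}{(u-\omega)^2} - \frac{1}{\omega^2} \Bigr) \]

affords a complex analytic group isomorphism

\[ \mathbf{C}/\mathcal{L} \longrightarrow \mathcal{E}^{\mathrm{Wst}}(\mathbf{C}), \qquad u + \mathcal{L} \longmapsto (\wp(u;\mathcal{L}), \wp'(u;\mathcal{L})), \]

where $(\wp(u;\mathcal{L}), \wp'(u;\mathcal{L}))$ is to be interpreted as the point at infinity if $u \in \mathcal{L}$.
For present purposes we must modify the classical normalization slightly.
Assume that $j(\mathcal{L}) \neq 0, 1728$ and consider the elliptic curve

\[ \mathcal{E} : y^2 = 4x^3 - \frac{27j(\mathcal{L})}{j(\mathcal{L})-1728}\, x - \frac{27j(\mathcal{L})}{j(\mathcal{L})-1728}. \]

Let $(g_2(\mathcal{L})/g_3(\mathcal{L}))^{3/2}$ denote a fixed square root of $(g_2(\mathcal{L})/g_3(\mathcal{L}))^3$. On
rewriting the relation $j(\mathcal{L}) = 1728\, g_2(\mathcal{L})^3/(g_2(\mathcal{L})^3 - 27g_3(\mathcal{L})^2)$ in the form

\[ \frac{g_2(\mathcal{L})^3}{g_3(\mathcal{L})^2} = \frac{27j(\mathcal{L})}{j(\mathcal{L}) - 1728}, \]

we see that the change of variables

\[ X = (g_3(\mathcal{L})/g_2(\mathcal{L}))\,x, \qquad Y = (g_3(\mathcal{L})/g_2(\mathcal{L}))^{3/2}\,y \]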
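transforms the equation for $\mathcal{E}$ into the equation for $\mathcal{E}^{\mathrm{Wst}}$. Thus we can
replace the map $u + \mathcal{L} \mapsto (\wp(u;\mathcal{L}), \wp'(u;\mathcal{L}))$ by the map

\[ u + \mathcal{L} \longmapsto \Bigl( \frac{g_2(\mathcal{L})}{g_3(\mathcal{L})}\, \wp(u;\mathcal{L}),\ \Bigl(\frac{g_2(\mathcal{L})}{g_3(\mathcal{L})}\Bigr)^{3/2} \wp'(u;\mathcal{L}) \Bigr) \]

to obtain a complex analytic group isomorphism of $\mathbf{C}/\mathcal{L}$ onto $\mathcal{E}(\mathbf{C})$. In
particular, if we fix a basis $\{\omega_1, \omega_2\}$ for $\mathcal{L}$, then the numbers

\[ x_{r,s}(\mathcal{L}) = \frac{g_2(\mathcal{L})}{g_3(\mathcal{L})}\, \wp\Bigl(\frac{r\omega_1 + s\omega_2}{N}; \mathcal{L}\Bigr) \qquad (r,s \in \mathbf{Z},\ (r,s) \not\equiv (0,0) \bmod N) \]

are the $x$-coordinates of the affine $N$-division points on $\mathcal{E}$. Now as a function
of $u$, $\wp(u;\mathcal{L})$ is periodic with respect to $\mathcal{L}$, even, and of degree 2 when viewed
as a map $\mathbf{C}/\mathcal{L} \to \mathbf{P}^1(\mathbf{C})$. Therefore

\[ x_{r,s}(\mathcal{L}) = x_{r',s'}(\mathcal{L}) \iff (r,s) \equiv \pm(r',s') \bmod N. \]

Letting $R$ denote the set of orbits of $(\mathbf{Z}/N\mathbf{Z})^2 - \{(0,0)\}$ under the negation
map, we see that if $(r,s)$ runs over a set of representatives in $\mathbf{Z}^2$ for the
distinct elements of $R$, then the numbers $x_{r,s}(\mathcal{L})$ are distinct.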
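
The size of the orbit set $R$ is easy to tabulate. The following Python sketch is an added illustration; the function name `negation_orbits` is hypothetical. It lists the orbits of the nonzero vectors of $(\mathbf{Z}/N\mathbf{Z})^2$ under negation, so that its length is the number of distinct $x$-coordinates counted above.

```python
from itertools import product


def negation_orbits(n):
    """Orbits of (Z/nZ)^2 minus the origin under v -> -v, as frozensets."""
    orbits = set()
    for r, s in product(range(n), repeat=2):
        if (r, s) != (0, 0):
            orbits.add(frozenset({(r, s), (-r % n, -s % n)}))
    return orbits
```

For odd N every orbit has two elements, giving $(N^2-1)/2$ orbits; for even N the three nonzero vectors of order 2 are fixed by negation and form orbits of size one.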

Now let $P(w; A, B) \in \mathbf{Z}[w, A, B]$ be the $N$-th division polynomial, a
universal polynomial with the property that $P(w_0; A, B) = 0$ if and only
if $w_0$ is the $x$-coordinate of an affine $N$-division point on the elliptic curve
$y^2 = 4x^3 + Ax + B$. Applying this property to the elliptic curve $\mathcal{E}$, we find
that

\[ P\Bigl( x_{r,s}(\mathcal{L});\ \frac{-27j(\mathcal{L})}{j(\mathcal{L}) - 1728},\ \frac{-27j(\mathcal{L})}{j(\mathcal{L}) - 1728} \Bigr) = 0 \]

whenever $j(\mathcal{L}) \neq 0, 1728$. In particular, let us take $\mathcal{L} = \mathcal{L}_z$, where $j(z) \neq
0, 1728$. Setting

\[ f_{r,s}(z) = \frac{g_2(z)}{g_3(z)}\, \wp\Bigl(\frac{rz+s}{N}; \mathcal{L}_z\Bigr) \qquad (r,s \in \mathbf{Z},\ (r,s) \not\equiv (0,0) \bmod N), \]

we have $f_{r,s}(z) = x_{r,s}(\mathcal{L}_z)$ and consequently

\[ P\Bigl( f_{r,s}(z);\ \frac{-27j(z)}{j(z) - 1728},\ \frac{-27j(z)}{j(z) - 1728} \Bigr) = 0. \]

Since this equation holds for all $z$ such that $j(z) \neq 0, 1728$, it holds
identically; in other words

\[ P\Bigl( f_{r,s};\ \frac{-27j}{j - 1728},\ \frac{-27j}{j - 1728} \Bigr) = 0 \]

in the field of meromorphic functions on $\mathfrak{H}$. Therefore the functions $f_{r,s}$
are $x$-coordinates of affine $N$-division points on the curve $E$ over $\mathbf{C}(j)$ with
which we started. In fact the functions $f_{r,s}$ comprise all such $x$-coordinates:
Proposition 3. The set of $x$-coordinates of affine $N$-division points on
the elliptic curve

\[ E : y^2 = 4x^3 - \frac{27j}{j-1728}\, x - \frac{27j}{j-1728} \]

coincides with the set of functions $f_{r,s}(z) = (g_2(z)/g_3(z))\, \wp\bigl(\frac{rz+s}{N}; \mathcal{L}_z\bigr)$
in any algebraic closure of $\mathbf{C}(j)$ containing these functions. Therefore

\[ \mathbf{C}(j, E[N]/\pm) = \mathbf{C}(j, \{f_{r,s}\}). \]

Proof. As $(r,s)$ runs over a set of representatives for $R$ the functions $f_{r,s}$
are all distinct, because their values are distinct at any $z$ such that $j(z) \neq
0, 1728$. Since each function $f_{r,s}$ is the $x$-coordinate of an affine $N$-division
point on $E$, and since the number of such $x$-coordinates, like the number
of functions $f_{r,s}$, is $|R|$, we conclude that the functions $f_{r,s}$ are precisely
the $x$-coordinates of the affine $N$-division points on $E$.


1.7. Completion of the proof. The proof of Proposition 2 and hence
of Theorem 1 is completed by combining Proposition 3 with the following:

Proposition 4. $\mathfrak{M}(\Gamma(N)) = \mathbf{C}(j, \{f_{r,s}\})$.

Proof. Let us use the notations $f_{r,s}$ and $f_{(r,s)}$ interchangeably. The proof
rests on two assertions:

(i) $f_{(r,s)} \circ \gamma = f_{(r,s)\gamma}$ for $\gamma \in \mathrm{SL}(2,\mathbf{Z})$.
(ii) There is a meromorphic function on $D$ which extends the meromorphic
function $F_{r,s}$ on $D^0$ defined by $f_{r,s}(z) = F_{r,s}(e^{2\pi i z/N})$.

Assertion (i) follows after a calculation from the relations

\[ g_2(c\mathcal{L}) = c^{-4} g_2(\mathcal{L}), \quad g_3(c\mathcal{L}) = c^{-6} g_3(\mathcal{L}), \quad \text{and} \quad \wp(cu, c\mathcal{L}) = c^{-2} \wp(u, \mathcal{L}). \]

For (ii) one uses the definition of $\wp(u, \mathcal{L})$ as a sum over lattice points to
show that $\lim_{y \to \infty} f_{r,s}(x + iy)$ exists uniformly in $x$. Now (i) implies that

\[ f_{r,s} \circ \gamma = f_{r,s} \quad \text{for } \gamma \in \Gamma(N), \]

while (i) and (ii) together imply that if $\delta \in \mathrm{SL}(2,\mathbf{Z})$ then the meromorphic
function $F$ on $D^0$ defined by

\[ f_{r,s}(\delta z) = F(e^{2\pi i z/N}) \]

extends to a meromorphic function on $D$ (put $(r',s') = (r,s)\delta$; then $F =
F_{r',s'}$). Therefore $f_{r,s}$ belongs to $\mathfrak{M}(\Gamma(N))$. To see that the $f_{r,s}$ actually
generate $\mathfrak{M}(\Gamma(N))$, we use (i) again: if the field inclusion $\mathbf{C}(j, \{f_{r,s}\}) \subset
\mathfrak{M}(\Gamma(N))$ were proper, then $\mathbf{C}(j, \{f_{r,s}\})$ would be fixed by a nontrivial
subgroup of the Galois group

\[ \Gamma(1)/\{\pm I\}\Gamma(N) \cong \mathrm{SL}(2,\mathbf{Z}/N\mathbf{Z})/\{\pm I\}. \]

But a subgroup of $\mathrm{SL}(2,\mathbf{Z}/N\mathbf{Z})/\{\pm I\}$ which acts trivially on $R$ is trivial.
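
The closing assertion can be confirmed by exhaustive search for small N. The Python sketch below is an added illustration with hypothetical function names. It assumes, as in assertion (i), that matrices act on row vectors (r, s) from the right, and it collects the elements of SL(2, Z/NZ) that send every vector to itself or its negative.

```python
from itertools import product


def acts_trivially_on_R(g, n):
    """True if (r, s) -> (r, s) g maps every vector into its own orbit {v, -v}."""
    a, b, c, d = g
    for r, s in product(range(n), repeat=2):
        image = ((r * a + s * c) % n, (r * b + s * d) % n)
        if image not in {(r, s), (-r % n, -s % n)}:
            return False
    return True


def kernel_on_R(n):
    """Elements of SL(2, Z/nZ), stored as (a, b, c, d), acting trivially on R."""
    return {g for g in product(range(n), repeat=4)
            if (g[0] * g[3] - g[1] * g[2]) % n == 1 % n
            and acts_trivially_on_R(g, n)}
```

In each case tried the kernel consists of I and -I only, so its image in the quotient by $\{\pm I\}$ is trivial, in agreement with the text.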
1.8. A normalized basis. The arguments just completed lead to a nearly
canonical choice of basis for $E[N]$ and hence to a nearly canonical
identification of $\mathrm{Gal}(\mathbf{C}(j, E[N])/\mathbf{C}(j))$ with $\mathrm{SL}(2,\mathbf{Z}/N\mathbf{Z})$ for any elliptic curve
$E$ over $\mathbf{C}(j)$ with invariant $j$. To formulate the result, let us say that $E$
has good reduction at a point $z \in \mathfrak{H}$ if $E$ has good reduction at the place
$j = j(z)$ of $\mathbf{C}(j)$. The reduction of $E$ will be denoted $E_z$. Recall that we
have fixed an isomorphism

\[ \theta : \mathrm{SL}(2,\mathbf{Z}/N\mathbf{Z})/\{\pm I\} \longrightarrow \mathrm{Gal}(\mathfrak{M}(\Gamma(N))/\mathbf{C}(j)) \]

by requiring that for $\gamma \in \mathrm{SL}(2,\mathbf{Z})$ and $f \in \mathfrak{M}(\Gamma(N))$,

\[ \theta([\gamma])(f) = f \circ \gamma^t, \]

where $[\gamma]$ denotes the image of $\gamma$ in $\mathrm{SL}(2,\mathbf{Z}/N\mathbf{Z})/\{\pm I\}$.


Proposition 5. Let $E$ be an elliptic curve over $\mathbf{C}(j)$ with invariant $j$, and
view $\mathbf{C}(j, E[N]/\pm)$ and $\mathfrak{M}(\Gamma(N))$ as subfields of a fixed algebraic closure
of $\mathbf{C}(j)$.

(i) $\mathbf{C}(j, E[N]/\pm) = \mathfrak{M}(\Gamma(N))$. In particular, for any $z \in \mathfrak{H}$, evaluation
at $z$ defines a place of $\mathbf{C}(j, E[N]/\pm)$. Henceforth we fix a place $\tilde{z}$ of
$\mathbf{C}(j, E[N])$ extending evaluation at $z$ on $\mathbf{C}(j, E[N]/\pm)$, and if $E$ has good
reduction at $z$ and $P \in E[N]$, then $P_z \in E_z(\mathbf{C})$ denotes the reduction of $P$
at $\tilde{z}$.

(ii) There is a basis $\{P_1, P_2\}$ for $E[N]$, unique up to replacement by
$\{-P_1, -P_2\}$, with the following properties:

(a) Let $\rho : \mathrm{Gal}(\mathbf{C}(j, E[N])/\mathbf{C}(j)) \to \mathrm{SL}(2,\mathbf{Z}/N\mathbf{Z})$ be the isomorphism
corresponding to $\{P_1, P_2\}$ and $\rho^\pm : \mathrm{Gal}(\mathbf{C}(j, E[N]/\pm)/\mathbf{C}(j)) \to
\mathrm{SL}(2,\mathbf{Z}/N\mathbf{Z})/\{\pm I\}$ the induced isomorphism. Then $\rho^\pm = \theta^{-1}$.

(b) If $z \in \mathfrak{H}$ is a point where $E$ has good reduction, then there is a
complex analytic group isomorphism of $\mathbf{C}/\mathcal{L}_z$ onto $E_z(\mathbf{C})$ sending
$1/N + \mathcal{L}_z$ to $(P_2)_z$.


Proof. (i) Since $\mathbf{C}(j, E[N]/\pm)$ depends on $E$ only up to quadratic twist,
this follows from Propositions 3 and 4.
(ii) Choose an equation for $E$ over $\mathbf{C}(j)$ of the form

\[ c(j)\, y^2 = 4x^3 - \frac{27j}{j-1728}\, x - \frac{27j}{j-1728}, \]

where $c(t) \in \mathbf{C}[t]$ is a polynomial with simple zeros. Then $E$ has good
reduction at $z \in \mathfrak{H}$ if and only if $j(z) \neq 0, 1728$ and $c(j(z)) \neq 0$. Now
we have seen (in the case $c(t) = 1$, and hence in general) that the
$x$-coordinates of the affine points of order $N$ on $E$ are the functions $f_{r,s}$ with
$(r,s) \in \mathbf{Z}^2$ and $(r,s) \not\equiv 0 \bmod N$. Thus for each such pair $(r,s)$ there is a
point $P_{r,s} \in E[N]$ such that $x(P_{r,s}) = f_{r,s}$. Of course the definition of $P_{r,s}$
represents an arbitrary choice from among two possibilities. We also make
an arbitrary choice of square roots $(g_2(z)/g_3(z))^{3/2}$ and $c(j(z))^{1/2}$ at each
point $z \in \mathfrak{H}$ where $E$ has good reduction, and we let $\lambda_z : \mathbf{C}/\mathcal{L}_z \to E_z(\mathbf{C})$
denote the complex analytic group isomorphism afforded by the map


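\[ u + \mathcal{L}_z \longmapsto \Bigl( \frac{g_2(z)}{g_3(z)}\, \wp(u; \mathcal{L}_z),\ c(j(z))^{-1/2} \Bigl(\frac{g_2(z)}{g_3(z)}\Bigr)^{3/2} \wp'(u; \mathcal{L}_z) \Bigr). \]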


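Now choose any point $z_0 \in \mathfrak{H}$ where $E$ has good reduction, and let $P_1$ and
$P_2$ be the preimages of $\lambda_{z_0}(z_0/N + \mathcal{L}_{z_0})$ and $\lambda_{z_0}(1/N + \mathcal{L}_{z_0})$ respectively
under the isomorphism $P \mapsto P_{z_0}$ of $E[N]$ onto $E_{z_0}[N]$. Then $\{(P_1)_{z_0}, (P_2)_{z_0}\}$
is a basis for $E_{z_0}[N]$ and a fortiori $\{P_1, P_2\}$ is a basis for $E[N]$. We claim
that

\[ rP_1 + sP_2 = \pm P_{r,s}. \tag{1} \]

Since the reduction map is injective on torsion, it suffices to check that

\[ (rP_1 + sP_2)_{z_0} = \pm (P_{r,s})_{z_0}. \]

The left-hand side is $\lambda_{z_0}\bigl(\frac{rz_0 + s}{N} + \mathcal{L}_{z_0}\bigr)$, while the right-hand side has
$x$-coordinate $f_{r,s}(z_0)$. Therefore equality holds.

To verify (a), take $\sigma \in \mathrm{Gal}(\mathbf{C}(j, E[N]/\pm)/\mathbf{C}(j))$, choose an element $\tilde{\sigma} \in
\mathrm{Gal}(\mathbf{C}(j, E[N])/\mathbf{C}(j))$ which restricts to $\sigma$, and select $\gamma \in \mathrm{SL}(2,\mathbf{Z})$ so that
the image of $\gamma$ in $\mathrm{SL}(2,\mathbf{Z}/N\mathbf{Z})$ is $\rho(\tilde{\sigma})$. Then $\rho^\pm(\sigma) = [\gamma]$, and the identity
to be proved is $\theta([\gamma]) = \sigma$. Since the $f_{r,s}$ generate $\mathfrak{M}(\Gamma(N))$ over $\mathbf{C}(j)$
(Proposition 4), it suffices to check that $\theta([\gamma])(f_{r,s}) = \sigma(f_{r,s})$. Write

\[ (r', s') = (r,s)\gamma^t. \]

As we have seen in the proof of Proposition 4,

\[ \theta([\gamma])(f_{r,s}) = f_{r,s} \circ \gamma^t = f_{(r',s')}. \]

On the other hand, $r'P_1 + s'P_2 = \pm P_{r',s'}$, whence

\[ f_{r',s'} = x(r'P_1 + s'P_2) = x(\rho(\tilde{\sigma})(rP_1 + sP_2)) = \sigma(x(rP_1 + sP_2)). \]

By (1), the last term is $\sigma(f_{r,s})$, and (a) follows.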

For (b), suppose that $E$ has good reduction at $z$. Since the $x$-coordinate
of $(P_2)_z$ is $f_{0,1}(z)$, we have $\lambda_z(1/N + \mathcal{L}_z) = \pm(P_2)_z$. Hence either $\lambda_z$ or
$-\lambda_z$ sends $1/N + \mathcal{L}_z$ to $(P_2)_z$.

Finally, suppose that $\{P_1', P_2'\}$ is another basis for $E[N]$ with properties
(a) and (b). Choose a point $z \in \mathfrak{H}$ where $E$ has good reduction, and let
$\lambda_z' : \mathbf{C}/\mathcal{L}_z \to E_z(\mathbf{C})$ be a complex analytic group isomorphism sending
$1/N + \mathcal{L}_z$ to $(P_2')_z$. Then $\lambda_z^{-1} \circ \lambda_z' \in \mathrm{Aut}(\mathbf{C}/\mathcal{L}_z)$. Since $E$ has good
reduction at $z$ we have $j(z) \neq 0, 1728$ and consequently $\mathrm{Aut}(\mathbf{C}/\mathcal{L}_z) =
\{\pm 1\}$. Hence after replacing $\{P_1, P_2\}$ by $\{-P_1, -P_2\}$ if necessary, we may
assume that $P_2' = P_2$. Then the change-of-basis matrix sending $\{P_1, P_2\}$
to $\{P_1', P_2'\}$ is a lower triangular matrix with 1 in the lower right-hand
entry. Furthermore, conjugation by this matrix induces the identity on
$\mathrm{SL}(2,\mathbf{Z}/N\mathbf{Z})/\{\pm I\}$, because $\{P_1', P_2'\}$ also has property (a). It follows that
the change-of-basis is the identity, as desired.


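1.9. Quotients of the upper half-plane. We will use Proposition 5 to
realize the modular curves as compactified quotients of $\mathfrak{H}$. First we must
recall how such quotients are given the structure of a Riemann surface.
Put

\[ \mathfrak{H}^* = \mathfrak{H} \cup \mathbf{P}^1(\mathbf{Q}). \]

The action of $\mathrm{SL}(2,\mathbf{Z})$ on $\mathfrak{H}$ by fractional linear transformations extends
to an action on $\mathfrak{H}^*$ preserving $\mathbf{P}^1(\mathbf{Q})$, and if $\Gamma$ is any subgroup of finite
index in $\mathrm{SL}(2,\mathbf{Z})$, then we denote the respective orbit spaces of $\mathfrak{H}^*$, $\mathfrak{H}$, and
$\mathbf{P}^1(\mathbf{Q})$ under $\Gamma$ by $\Gamma\backslash\mathfrak{H}^*$, $\Gamma\backslash\mathfrak{H}$, and $\Gamma\backslash\mathbf{P}^1(\mathbf{Q})$. Thus

\[ \Gamma\backslash\mathfrak{H}^* = (\Gamma\backslash\mathfrak{H}) \cup (\Gamma\backslash\mathbf{P}^1(\mathbf{Q})). \]

Since $\Gamma$ has finite index in $\mathrm{SL}(2,\mathbf{Z})$ and $\mathrm{SL}(2,\mathbf{Z})$ acts transitively on $\mathbf{P}^1(\mathbf{Q})$,
the set $\Gamma\backslash\mathbf{P}^1(\mathbf{Q})$ is finite.

We would like to put a topology on $\Gamma\backslash\mathfrak{H}^*$. First we put a topology on
$\mathfrak{H}^*$ itself. Given $y_0 > 0$ and $c \in \mathbf{P}^1(\mathbf{Q})$, choose a matrix $\delta \in \mathrm{SL}(2,\mathbf{Z})$ such
that $c = \delta\infty$, and put

\[ U_{y_0} = \{x + iy : x \in \mathbf{R},\ y > y_0\} \subset \mathfrak{H}, \qquad U^0_{c,y_0} = \delta(U_{y_0}), \qquad U_{c,y_0} = U^0_{c,y_0} \cup \{c\}. \]

The sets $U^0_{c,y_0}$ and $U_{c,y_0}$ depend only on $c$ and $y_0$, not on the choice of
$\delta$, because $U_{y_0}$ is preserved by the stabilizer of $\infty$ in $\mathrm{SL}(2,\mathbf{Z})$, namely
$\bigl\{\pm\begin{pmatrix} 1 & n \\ 0 & 1 \end{pmatrix} : n \in \mathbf{Z}\bigr\}$. We make $\mathfrak{H}^*$ into a topological space by choosing as a
basis of open sets all sets of the following two types:
(a) open subsets $U$ of $\mathfrak{H}$,
(b) subsets of $\mathfrak{H}^*$ of the form $U_{c,y_0}$.
Then the quotient topology on $\Gamma\backslash\mathfrak{H}^*$ corresponding to the natural projection

\[ \pi : \mathfrak{H}^* \longrightarrow \Gamma\backslash\mathfrak{H}^* \]

makes $\Gamma\backslash\mathfrak{H}^*$ into a compact Hausdorff space.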
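
The choice of a matrix $\delta$ with $c = \delta\infty$ is effective: it amounts to the extended Euclidean algorithm, which is also one way to see that SL(2, Z) acts transitively on $\mathbf{P}^1(\mathbf{Q})$. The Python sketch below is an added illustration with hypothetical function names; it handles finite cusps c = p/q only, the cusp at infinity being served by the identity matrix.

```python
from fractions import Fraction


def ext_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b) and g >= 0."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    if old_r < 0:
        old_r, old_x, old_y = -old_r, -old_x, -old_y
    return old_r, old_x, old_y


def matrix_sending_infinity_to(cusp):
    """Return (a, b, c, d) in SL(2, Z) with a/c equal to the rational number `cusp`."""
    p, q = cusp.numerator, cusp.denominator
    _, x, y = ext_gcd(p, q)  # p*x + q*y == 1 because p/q is in lowest terms
    return (p, -y, q, x)     # determinant p*x + q*y == 1
```

Since $\delta\infty = a/c$ for a matrix with first column (a, c), the returned matrix sends $\infty$ to the given cusp.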


The next step is to make $\Gamma\backslash\mathfrak{H}^*$ into a compact Riemann surface. Let
$\mathcal{F}$ be the sheaf of continuous complex-valued functions on $\Gamma\backslash\mathfrak{H}^*$, and $\mathcal{F}_x$
the stalk at a point $x$. We think of $\mathcal{F}_x$ as the set of equivalence classes of
pairs $(f,V)$, where $V$ is an open neighborhood of $x$ and $f$ is a continuous
complex-valued function on $V$, two pairs $(f,V)$ and $(g,W)$ being equivalent
if $f$ and $g$ coincide on $V \cap W$. To make $\Gamma\backslash\mathfrak{H}^*$ into a Riemann surface, we
must define a subsheaf $\mathcal{O}$ of $\mathcal{F}$ to serve as the complex structure sheaf.
We define $\mathcal{O}$ by specifying that its stalk $\mathcal{O}_x$ at $x$ is the subring of $\mathcal{F}_x$
consisting of those equivalence classes which contain a pair $(f,V)$ of one of
the following two types:

(a) There exists $z \in \mathfrak{H}$ and an open neighborhood $U$ of $z$ in $\mathfrak{H}$ such that
$x = \pi(z)$, $V = \pi(U)$, and $f \circ \pi$ is holomorphic on $U$.

(b) There exists $c \in \mathbf{P}^1(\mathbf{Q})$ and $y_0 > 0$ such that $x = \pi(c)$, $V =
\pi(U_{c,y_0})$, and $f \circ \pi$ satisfies the following condition. Choose $\delta \in
\mathrm{SL}(2,\mathbf{Z})$ such that $c = \delta\infty$, and let $M$ be a positive integer such
that $(f \circ \pi \circ \delta)(z + M) = (f \circ \pi \circ \delta)(z)$ for $z \in U_{y_0}$. (Such an integer
exists because $\Gamma$ has finite index in $\mathrm{SL}(2,\mathbf{Z})$ and $\pi$ is invariant under
$\Gamma$.) Put $r = e^{-2\pi y_0/M}$ and let $F$ be the function on the punctured
disk $D^0(r) = \{q \in \mathbf{C} : 0 < |q| < r\}$ such that

\[ (f \circ \pi \circ \delta)(z) = F(e^{2\pi i z/M}). \]

Then $F$ is holomorphic on $D^0(r)$ and extends to a holomorphic
function on the full disk $D(r) = \{q \in \mathbf{C} : |q| < r\}$.

One can check that with this definition of $\mathcal{O}$, every point $x$ of $\Gamma\backslash\mathfrak{H}^*$ has
an open neighborhood $V$ such that the ringed space $(V, \mathcal{O}|_V)$ is isomorphic
to the ringed space of an open disk in $\mathbf{C}$. (The verification requires a little
care if $x$ is the image of an elliptic fixed point of $\Gamma$, i.e., if $x = \pi(z)$ for
some $z \in \mathfrak{H}$ which is fixed by an element of $\Gamma$ different from $\pm I$.) Granting
that this is so, we conclude that $\mathcal{O}$ gives $\Gamma\backslash\mathfrak{H}^*$ the structure of a Riemann
surface. Furthermore, and this is now the key point, the definitions have
been constructed in such a way that the map

\[ f \longmapsto (f \circ \pi)|_{\mathfrak{H}} \]

identifies the function field of $\Gamma\backslash\mathfrak{H}^*$ with $\mathfrak{M}(\Gamma)$. Note that both $\Gamma\backslash\mathfrak{H}^*$ and
$\mathfrak{M}(\Gamma)$ depend only on $\bar{\Gamma}$, the image of $\Gamma$ in $\mathrm{SL}(2,\mathbf{Z})/\{\pm I\}$.


1.10. Modular curves as quotients of the upper half-plane. Given
a modular curve $X(H)$, we shall now produce a subgroup $\Gamma$ of $\mathrm{SL}(2,\mathbf{Z})$
such that the Riemann surfaces $X(H)(\mathbf{C})$ and $\Gamma\backslash\mathfrak{H}^*$ are isomorphic. By
assumption, $H$ is a subgroup of $\mathrm{GL}(2,\mathbf{Z}/N\mathbf{Z})$ satisfying two conditions:
$-I \in H$ and $\det : H \to (\mathbf{Z}/N\mathbf{Z})^\times$ is surjective. We let $\Gamma \subset \mathrm{SL}(2,\mathbf{Z})$ be the
transpose of the inverse image of $H \cap \mathrm{SL}(2,\mathbf{Z}/N\mathbf{Z})$ under the reduction
map $\mathrm{SL}(2,\mathbf{Z}) \to \mathrm{SL}(2,\mathbf{Z}/N\mathbf{Z})$.

Proposition 6. With $H$ and $\Gamma$ as above, the Riemann surfaces $X(H)(\mathbf{C})$
and $\Gamma\backslash\mathfrak{H}^*$ are isomorphic.

Proof. Let $E$ be an elliptic curve over $\mathbf{Q}(j)$ with invariant $j$, and identify
$\mathrm{Gal}(\mathbf{Q}(j, E[N])/\mathbf{Q}(j))$ with $\mathrm{GL}(2,\mathbf{Z}/N\mathbf{Z})$ using a basis for $E[N]$ as in
Proposition 5. The function field of $X(H)$ over $\mathbf{Q}$ is the subfield $K$ of
$\mathbf{Q}(j, E[N]/\pm)$ fixed by $H$, whence the function field of the Riemann surface
$X(H)(\mathbf{C})$ is $\mathbf{C}K$. Now our identification of $\mathrm{Gal}(\mathbf{Q}(j, E[N])/\mathbf{Q}(j))$ with
$\mathrm{GL}(2,\mathbf{Z}/N\mathbf{Z})$ affords an identification

\[ \mathrm{Gal}(\mathbf{C}(j, E[N]/\pm)/\mathbf{C}(j)) \cong \mathrm{SL}(2,\mathbf{Z}/N\mathbf{Z})/\{\pm I\}, \]

and the hypotheses on $H$ imply that $\mathbf{C}K$ is the subfield of $\mathbf{C}(j, E[N]/\pm)$
fixed by $(H \cap \mathrm{SL}(2,\mathbf{Z}/N\mathbf{Z}))/\{\pm I\}$. Applying parts (i) and (ii)(a) of
Proposition 5, we deduce that $\mathbf{C}K = \mathfrak{M}(\Gamma)$, whence the result follows from the
fact that a compact Riemann surface is determined up to isomorphism by
its function field.

In particular, put

\[ \Gamma_0(N) = \Bigl\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}(2,\mathbf{Z}) : c \equiv 0 \bmod N \Bigr\} \]

and

\[ \Gamma_1(N) = \Bigl\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}(2,\mathbf{Z}) : c \equiv 0 \bmod N,\ a, d \equiv 1 \bmod N \Bigr\}. \]

Then $X_0(N)(\mathbf{C}) \cong \Gamma_0(N)\backslash\mathfrak{H}^*$ and $X_1(N)(\mathbf{C}) \cong \Gamma_1(N)\backslash\mathfrak{H}^*$ (in the latter
case we use the fact that $\Gamma\backslash\mathfrak{H}^*$ depends only on $\bar{\Gamma}$). Now consider pairs
$(T,C)$ consisting of a one-dimensional complex torus $T$ and a cyclic subgroup
$C$ of $T$ of order $N$. An isomorphism from one pair $(T_1, C_1)$ to another
$(T_2, C_2)$ is a complex analytic group isomorphism from $T_1$ to $T_2$ sending $C_1$
onto $C_2$. We denote the isomorphism class containing $(T,C)$ by $[T,C]$ and
the set of all isomorphism classes by $\mathrm{Tori}_0(N)$. For a point $P$ of order $N$
on $T$ we make the analogous definitions of $(T,P)$, $[T,P]$, and $\mathrm{Tori}_1(N)$.
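
A numerical consistency check links the group $\Gamma_0(N)$ to the cyclic subgroups it parametrizes. The Python sketch below is an added illustration with hypothetical function names. It compares the index of $\Gamma_0(N)$ in SL(2, Z), computed inside SL(2, Z/NZ), with the number of cyclic subgroups of order N in $(\mathbf{Z}/N\mathbf{Z})^2$. The first computation relies on the surjectivity of the reduction map from SL(2, Z) to SL(2, Z/NZ), a standard fact assumed here.

```python
from itertools import product


def index_gamma0(n):
    """Index of Gamma_0(n) in SL(2, Z), counted in SL(2, Z/nZ) by brute force."""
    sl2 = [(a, b, c, d) for a, b, c, d in product(range(n), repeat=4)
           if (a * d - b * c) % n == 1 % n]
    upper = [g for g in sl2 if g[2] == 0]  # matrices with c = 0 modulo n
    return len(sl2) // len(upper)


def cyclic_subgroups(n):
    """Cyclic subgroups of order n in (Z/nZ)^2, each as a frozenset of its elements."""
    subgroups = set()
    for r, s in product(range(n), repeat=2):
        group = frozenset(((k * r) % n, (k * s) % n) for k in range(n))
        if len(group) == n:
            subgroups.add(group)
    return subgroups
```

The two counts agree for every N tried, as the bijections (2) and (3) in the proof of Proposition 1 predict.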


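Proposition 7. Let $E$ be an elliptic curve over $\mathbf{Q}(j)$ with invariant $j$, and
let $S$ be a subset of $\mathbf{P}^1(\mathbf{C})$ containing all places where $E$ has bad reduction.
Fix an ordered basis for $E[N]$ over $\mathbf{Z}/N\mathbf{Z}$, let $P$ be the second element of
this basis, and let $C$ be the cyclic group of order $N$ generated by $P$. Then
there is an isomorphism of Riemann surfaces $X_0(N)(\mathbf{C}) \cong \Gamma_0(N)\backslash\mathfrak{H}^*$ such
that the diagram

\[ \begin{array}{ccc} X_0(N)(\mathbf{C})_S & \longrightarrow & \mathrm{Ell}_0(N)(\mathbf{C})_S \\ \downarrow & & \downarrow \\ \Gamma_0(N)\backslash\mathfrak{H} & \longrightarrow & \mathrm{Tori}_0(N) \end{array} \]

commutes, where:

- The top horizontal arrow is the bijection $x \mapsto [E_x, C_x]$ of Proposition 1.
- The bottom horizontal arrow is a bijection and has the form

\[ [z] \longmapsto [\mathbf{C}/\mathcal{L}_z, \langle 1/N + \mathcal{L}_z \rangle] \qquad (z \in \mathfrak{H}), \]

  where $[z]$ denotes the class of $z$ in $\Gamma_0(N)\backslash\mathfrak{H}$ and $\langle 1/N + \mathcal{L}_z \rangle$ denotes
  the cyclic subgroup of $\mathbf{C}/\mathcal{L}_z$ generated by the coset $1/N + \mathcal{L}_z$.
- The left vertical arrow is the restriction to $X_0(N)(\mathbf{C})_S$ of the
  isomorphism $X_0(N)(\mathbf{C}) \cong \Gamma_0(N)\backslash\mathfrak{H}^*$.
- The right vertical arrow is the restriction to $\mathrm{Ell}_0(N)(\mathbf{C})_S$ of the
  bijection from $\mathrm{Ell}_0(N)(\mathbf{C})$ to $\mathrm{Tori}_0(N)$ given by $[\mathcal{E},\mathcal{C}] \mapsto [\mathcal{E}(\mathbf{C}), \mathcal{C}]$.

The same is true if $X_0(N)$, $\mathrm{Ell}_0(N)$, $\Gamma_0(N)$, $\mathrm{Tori}_0(N)$, $C$, and $\langle 1/N + \mathcal{L}_z \rangle$
are replaced by $X_1(N)$, $\mathrm{Ell}_1(N)$, $\Gamma_1(N)$, $\mathrm{Tori}_1(N)$, $P$, and $1/N + \mathcal{L}_z$.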


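Proof. Without loss of generality we may assume that $P$ is the second
basis vector in a basis for $E[N]$ chosen as in Proposition 5. Then the only
statement requiring proof is the bijectivity of the bottom horizontal arrow.
The cases $\Gamma_0(N)$ and $\Gamma_1(N)$ are similar; we deal with the latter. Suppose
that

\[ [\mathbf{C}/\mathcal{L}_z, 1/N + \mathcal{L}_z] = [\mathbf{C}/\mathcal{L}_{z'}, 1/N + \mathcal{L}_{z'}]. \]

Then there exists $w \in \mathbf{C}^\times$ so that $\mathcal{L}_{z'} = w\mathcal{L}_z$ and $1/N \equiv w/N \pmod{w\mathcal{L}_z}$.
The first condition implies that $\{w, wz\}$ is a basis for $\mathcal{L}_{z'}$. Hence we can
write

\[ z' = w(az + b), \qquad 1 = w(cz + d) \]

with integers $a, b, c, d$ satisfying $ad - bc = \pm 1$. Since $z' = (az+b)/(cz+d)$
and $z$ and $z'$ both have positive imaginary part, it follows that $ad - bc = 1$.
Substituting $1 = w(cz+d)$ in the congruence $1/N \equiv w/N \pmod{w\mathcal{L}_z}$, we
find that $c \equiv 0 \pmod N$ and $d \equiv 1 \pmod N$, whence $a \equiv 1 \pmod N$ also
since $ad - bc = 1$. Thus $z' = \gamma z$ with

\[ \gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma_1(N), \]

and consequently $[z] = [z']$. Next suppose that $[T,P] \in \mathrm{Tori}_1(N)$. Write
$T = \mathbf{C}/\mathcal{L}$ and $P = \omega/N + \mathcal{L}$ with $\omega \in \mathcal{L}$. After replacing $\omega$ by another
element of $\omega + N\mathcal{L}$, we may assume that $\omega$ is primitive, so that $\omega$ is part
of a basis $\{\omega', \omega\}$ for $\mathcal{L}$. Put $z = \pm\omega'/\omega$, where the sign is chosen so that
$\mathrm{Im}(z) > 0$. Multiplication by $\omega^{-1}$ gives an isomorphism of $(\mathbf{C}/\mathcal{L}, \omega/N + \mathcal{L})$
onto $(\mathbf{C}/\mathcal{L}_z, 1/N + \mathcal{L}_z)$, whence $[T,P]$ coincides with $[\mathbf{C}/\mathcal{L}_z, 1/N + \mathcal{L}_z]$.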
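
The congruence conditions found in this proof can be illustrated numerically. The Python sketch below is an added illustration with hypothetical function names, working in floating point with a tolerance chosen arbitrarily. For $\gamma \in \mathrm{SL}(2,\mathbf{Z})$ and $z' = \gamma z$ it tests whether multiplication by $w = 1/(cz+d)$ carries the point $1/N + \mathcal{L}_z$ to $1/N + \mathcal{L}_{z'}$, that is, whether $(w-1)/N$ lies in $\mathcal{L}_{z'}$.

```python
def coordinates(u, z):
    """Real coordinates (m, n) of u in the basis {z, 1}, so that u = m*z + n."""
    m = u.imag / z.imag
    n = (u - m * z).real
    return m, n


def is_lattice_point(u, z, tol=1e-9):
    """True if u lies in Z*z + Z up to the tolerance tol."""
    m, n = coordinates(u, z)
    return abs(m - round(m)) < tol and abs(n - round(n)) < tol


def same_class(z, gamma, level):
    """Check that w = 1/(c z + d) maps 1/N + L_z to 1/N + L_{z'} with z' = gamma z."""
    a, b, c, d = gamma
    w = 1 / (c * z + d)
    z_new = (a * z + b) / (c * z + d)
    return is_lattice_point((w - 1) / level, z_new)
```

With N = 5 the matrix with rows (6, 1) and (5, 1) passes the test, while the matrix with rows (1, 0) and (1, 1), whose lower-left entry is not divisible by 5, fails it.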
2. HECKE CORRESPONDENCES


By a correspondence on a smooth projective curve $X$ we shall mean a
triple $T = (Z, \varphi, \psi)$, where $Z$ is a smooth projective curve and $\varphi$ and $\psi$
are nonconstant morphisms $Z \to X$. We say that $T$ is defined over a field
$k$ if $X$, $Z$, $\varphi$, and $\psi$ are all defined over $k$. We view an automorphism $\delta$ of
$X$ as a special case of a correspondence by putting $Z = X$, $\varphi = \mathrm{id}_X$, and
$\psi = \delta$.


2.1. The Hecke correspondences on $X_0(N)$. Let $N$ be a positive
integer, $p$ a prime number, and $M$ the least common multiple of $N$ and $p$.
Choose an elliptic curve $E$ over $\mathbf{Q}(t)$ with invariant $t$, and fix a basis for
$E[M]$ over $\mathbf{Z}/M\mathbf{Z}$, whence an identification of $\mathrm{Gal}(\mathbf{Q}(t, E[M])/\mathbf{Q}(t))$ with
$\mathrm{GL}(2,\mathbf{Z}/M\mathbf{Z})$. We consider the subgroup


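\[ H_p = \Bigl\{ \begin{pmatrix} a & c \\ b & d \end{pmatrix} \in \mathrm{GL}(2,\mathbf{Z}/M\mathbf{Z}) : c \equiv 0 \bmod N,\ b \equiv 0 \bmod p \Bigr\}. \]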


Since $-I \in H_p$ and $\det(H_p) = (\mathbf{Z}/M\mathbf{Z})^\times$, the fixed field of $H_p$ is the
function field of a smooth projective curve over $\mathbf{Q}$, which we shall denote
$X_0(N,p)$. The Hecke correspondence $T_p$ on $X_0(N)$ is a correspondence
over $\mathbf{Q}$ of the form

\[ T_p = (X_0(N,p), \varphi_p, \psi_p), \]

where the morphisms $\varphi_p, \psi_p : X_0(N,p) \to X_0(N)$ must now be defined.

The definition of $\varphi_p$ is straightforward. Let $K_p$ and $K$ denote the fixed
fields of $H_p$ and


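\[ H = \Bigl\{ \begin{pmatrix} a & c \\ b & d \end{pmatrix} \in \mathrm{GL}(2,\mathbf{Z}/M\mathbf{Z}) : c \equiv 0 \bmod N \Bigr\} \]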


respectively. Then $H_p \subset H$, whence $K \subset K_p$. The latter inclusion is an
inclusion of function fields and so corresponds to a morphism of curves

\[ \varphi_p : X_0(N,p) = X(H_p) \longrightarrow X(H). \]

But $X(H)$ is $X_0(N)$, because the kernel of the reduction map

\[ \mathrm{GL}(2,\mathbf{Z}/M\mathbf{Z}) \longrightarrow \mathrm{GL}(2,\mathbf{Z}/N\mathbf{Z}) \]

is a subgroup of $H$ and the image of $H$ in $\mathrm{GL}(2,\mathbf{Z}/N\mathbf{Z})$ is the lower
triangular group. Therefore $\varphi_p$ is a morphism from $X_0(N,p)$ to $X_0(N)$.
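
The two hypotheses on $H_p$ and the degree of $\varphi_p$ can be checked by enumeration for small N and p. The Python sketch below is an added illustration; the function name `hecke_groups` is hypothetical, and it assumes the labelling in which the upper-right entry of a matrix is c and the lower-left entry is b, so that the condition on c modulo N gives the lower triangular group mentioned above. The index of $H_p$ in H does not depend on this labelling. The identification of that index with the degree of $\varphi_p$ is the usual Galois correspondence, assumed here.

```python
from itertools import product
from math import gcd


def hecke_groups(n, p):
    """Return (M, units, H, H_p) for M = lcm(n, p).

    A matrix with rows (a, c) and (b, d) is stored as the tuple (a, c, b, d).
    """
    m = n * p // gcd(n, p)
    units = {u for u in range(m) if gcd(u, m) == 1}
    h, h_p = [], []
    for a, c, b, d in product(range(m), repeat=4):
        if (a * d - b * c) % m in units and c % n == 0:
            h.append((a, c, b, d))
            if b % p == 0:
                h_p.append((a, c, b, d))
    return m, units, h, h_p


def index_and_checks(n, p):
    """Return ([H : H_p], whether det(H_p) is all units, whether -I lies in H_p)."""
    m, units, h, h_p = hecke_groups(n, p)
    dets = {(a * d - b * c) % m for a, c, b, d in h_p}
    minus_identity = ((-1) % m, 0, 0, (-1) % m)
    return len(h) // len(h_p), dets == units, minus_identity in h_p
```

In the cases tried the index is p + 1 when p does not divide N and p when it does.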

The definition of $\psi_p$ is more subtle. It corresponds to an inclusion of
function fields $K' \hookrightarrow K_p$, where $K'$ is a subfield of $K_p$ which is isomorphic
to, but distinct from, $K$. To define $K'$, let us recall once again that our
identification of $\mathrm{Gal}(\mathbf{Q}(t, E[M])/\mathbf{Q}(t))$ with $\mathrm{GL}(2,\mathbf{Z}/M\mathbf{Z})$ rests on a choice
of basis for $E[M]$ over $\mathbf{Z}/M\mathbf{Z}$ and hence in particular on a decomposition
of $E[M]$ as a direct sum of cyclic subgroups of order $M$:

\[ E[M] = C_1 \oplus C_2. \]

Let $C$ denote the cyclic subgroup of $C_2$ of order $N$, and let $\Pi$ denote the
cyclic subgroup of $C_1$ of order $p$. Then $C$ and $\Pi$ are stable under $H_p$, hence
defined over $K_p$. In particular, since $\Pi$ is defined over $K_p$ there is an elliptic
curve $E/\Pi$ defined over $K_p$ together with an isogeny

\[ \lambda : E \longrightarrow E/\Pi \]

over $K_p$ with kernel $\Pi$. Furthermore, $E/\Pi$ has a cyclic subgroup of order
$N$ defined over $K_p$, namely the subgroup $\lambda(C)$. Now put

\[ t' = j(E/\Pi) \in K_p, \]

and let $E'$ be an elliptic curve over $\mathbf{Q}(t')$ with invariant $t'$. Then there is an
isomorphism $\theta : E/\Pi \to E'$ over $\bar{K}_p$, and $\theta$ is unique up to sign because $t'$
is transcendental, hence $\neq 0, 1728$. It follows that the group $C' = \theta(\lambda(C))$
is a cyclic subgroup of $E'$ of order $N$ which is independent of the choice
of $\theta$. Furthermore, $C'$ is defined over $K_p$ because $\lambda(C)$ is defined over $K_p$
and $\sigma \circ \theta \circ \sigma^{-1} = \pm\theta$ for $\sigma \in \mathrm{Gal}(\bar{K}_p/K_p)$. Thus $K_p$ contains the field $K'$
fixed by

\[ \{\sigma \in \mathrm{Gal}(\overline{\mathbf{Q}(t')}/\mathbf{Q}(t')) : \sigma(C') = C'\}. \]

Since $K'$ is isomorphic to the function field of $X_0(N)$ we obtain the desired
morphism $\psi_p$ from $X_0(N,p)$ to $X_0(N)$.


2.2. The Hecke correspondences on $X_1(N)$. Mutatis mutandis, the
same construction yields a correspondence

\[ T_p = (X_1(N,p), \varphi_p, \psi_p) \]

on $X_1(N)$, where $X_1(N,p)$ is the modular curve determined by the subgroup


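\[ H_p = \Bigl\{ \begin{pmatrix} a & c \\ b & d \end{pmatrix} \in \mathrm{GL}(2,\mathbf{Z}/M\mathbf{Z}) : c \equiv 0 \bmod N,\ b \equiv 0 \bmod p,\ d \equiv \pm 1 \bmod N \Bigr\} \]

of $\mathrm{GL}(2,\mathbf{Z}/M\mathbf{Z})$. Put

\[ H = \Bigl\{ \begin{pmatrix} a & c \\ b & d \end{pmatrix} \in \mathrm{GL}(2,\mathbf{Z}/M\mathbf{Z}) : c \equiv 0 \bmod N,\ d \equiv \pm 1 \bmod N \Bigr\} \]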


and write $K_p$ and $K$ for the subfields of $\mathbf{Q}(t, E[M])$ fixed by $H_p$ and $H$.
Then $\varphi_p$ is the morphism $X_1(N,p) \to X_1(N)$ corresponding to the inclusion
of $K$ in $K_p$. To define $\psi_p$, let $\{P_1, P_2\}$ be our chosen basis for $E[M]$ and
put $P = (M/N)P_2$. Also let $\Pi$ be the group of order $p$ generated by
$(M/p)P_1$. As before, there is an elliptic curve $E/\Pi$ over $K_p$ and an isogeny
$\lambda : E \to E/\Pi$ over $K_p$ with kernel $\Pi$. Since $K$ is contained in $K_p$ and the
set $\{\pm P\}$ is stable under $\mathrm{Gal}(\mathbf{Q}(t, E[M])/K)$, it follows that $\{\pm\lambda(P)\}$ is
stable under $\mathrm{Gal}(\mathbf{Q}(t, E[M])/K_p)$. Putting $t' = j(E/\Pi)$ as before, we see
that if $E'$ is any elliptic curve over $\mathbf{Q}(t')$ with invariant $t'$ and $\theta : E/\Pi \to E'$
is any isomorphism over $\bar{K}_p$, then the point $P' = \theta(\lambda(P))$ has order $N$ and
$\{\pm P'\}$ is defined over $K_p$. Hence $K_p$ contains $K'$, the field fixed by

\[ \{\sigma \in \mathrm{Gal}(\overline{\mathbf{Q}(t')}/\mathbf{Q}(t')) : \sigma(P') = \pm P'\}. \]

Since $K'$ is isomorphic to the function field of $X_1(N)$ we obtain a morphism
$\psi_p$ from $X_1(N,p)$ to $X_1(N)$.


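2.3. Moduli interpretation of the Hecke correspondences. We
denote the free abelian group on a set $W$ by $\mathrm{Div}(W)$. In particular, if
$X$ is a smooth projective curve over an algebraically closed field $k$, then
$\mathrm{Div}(X(k))$ is the usual group of divisors on $X(k)$. Given a correspondence
$T = (Z, \varphi, \psi)$ on $X$, we use the same letter $T$ to denote the map

\[ X(k) \longrightarrow \mathrm{Div}(X(k)), \qquad x \longmapsto \sum_{\substack{z \in Z \\ \varphi(z) = x}} (\mathrm{mult}_z \varphi)\, \psi(z), \]

where $\mathrm{mult}_z \varphi$ is the ramification index of $\varphi$ at $z$. In the case of the Hecke
correspondence $T_p$ we shall give a formula for this map which displays its
effect on isomorphism classes $[\mathcal{E},\mathcal{C}]$ and $[\mathcal{E},\mathcal{P}]$. First a point of notation.

Suppose that $\mathcal{E}$ is an elliptic curve over an algebraically closed field $k$ and
$\Lambda$ is a subgroup of $\mathcal{E}$ of order $p$. In keeping with our usage thus far, we shall
write $\mathcal{E}/\Lambda$ for an elliptic curve which is the image of a separable isogeny
with domain $\mathcal{E}$ and kernel $\Lambda$. Note that $\mathcal{E}/\Lambda$ is unique up to isomorphism.
Now if $\lambda : \mathcal{E} \to \mathcal{E}/\Lambda$ is a separable isogeny with kernel $\Lambda$ and $\mathcal{C}$ is a cyclic
subgroup of $\mathcal{E}$ of order $N$ which intersects $\Lambda$ trivially (a vacuous condition
if $N$ is prime to $p$), then we obtain a well-defined isomorphism class

\[ [\mathcal{E}/\Lambda, (\mathcal{C} + \Lambda)/\Lambda] \in \mathrm{Ell}_0(N)(k) \]

by putting $[\mathcal{E}/\Lambda, (\mathcal{C} + \Lambda)/\Lambda] = [\lambda(\mathcal{E}), \lambda(\mathcal{C})]$. To see that $[\mathcal{E}/\Lambda, (\mathcal{C} + \Lambda)/\Lambda]$ is
independent of the choice of $\lambda$, suppose that $\lambda' : \mathcal{E} \to \mathcal{E}/\Lambda$ is another such
isogeny. Then there is an automorphism $\theta$ of $\mathcal{E}/\Lambda$ such that $\lambda' = \theta \circ \lambda$,
whence $[\lambda'(\mathcal{E}), \lambda'(\mathcal{C})] = [\lambda(\mathcal{E}), \lambda(\mathcal{C})]$. Similarly, if $\mathcal{P}$ is a point of order $N$ on
$\mathcal{E}$ such that the cyclic subgroup $\langle\mathcal{P}\rangle$ generated by $\mathcal{P}$ intersects $\Lambda$ trivially,
then we define

\[ [\mathcal{E}/\Lambda, \mathcal{P} + \Lambda] \in \mathrm{Ell}_1(N)(k) \]

by putting $[\mathcal{E}/\Lambda, \mathcal{P} + \Lambda] = [\lambda(\mathcal{E}), \lambda(\mathcal{P})]$.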
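Proposition 8. Let $E$ be an elliptic curve over $\mathbf{Q}(t)$ with invariant $t$. Let
$S$, $S'$, and $S''$ be subsets of $\mathbf{P}^1(\mathbf{C})$ containing all places where $E$ has bad
reduction and such that

\[ \varphi_p^{-1}(X_0(N)(\mathbf{C})_S) \subset X_0(N,p)(\mathbf{C})_{S'} \]

and

\[ \psi_p(X_0(N,p)(\mathbf{C})_{S'}) \subset X_0(N)(\mathbf{C})_{S''}. \]

Fix an ordered basis for $E[N]$ over $\mathbf{Z}/N\mathbf{Z}$, let $P$ be the second element of
this basis, and let $C$ be the cyclic group generated by $P$. Then the diagram

\[ \begin{array}{ccc} X_0(N)(\mathbf{C})_S & \xrightarrow{\ T_p\ } & \mathrm{Div}(X_0(N)(\mathbf{C})_{S''}) \\ \downarrow & & \downarrow \\ \mathrm{Ell}_0(N)(\mathbf{C}) & \longrightarrow & \mathrm{Div}(\mathrm{Ell}_0(N)(\mathbf{C})) \end{array} \]

commutes, where the left vertical arrow is the map $x \mapsto [E_x, C_x]$ of
Proposition 1, the right vertical arrow is the corresponding homomorphism between
free abelian groups, and the bottom horizontal arrow is the map

\[ [\mathcal{E},\mathcal{C}] \longmapsto \sum_{\substack{[\mathcal{E}[p] : \Lambda] = p \\ \mathcal{C} \cap \Lambda = \{0\}}} [\mathcal{E}/\Lambda, (\mathcal{C} + \Lambda)/\Lambda], \]

the sum being taken over subgroups $\Lambda$ of index $p$ in $\mathcal{E}[p]$ which intersect
$\mathcal{C}$ trivially. The same is true if $X_0(N)$ is replaced by $X_1(N)$, $\mathrm{Ell}_0(N)$ by
$\mathrm{Ell}_1(N)$, the left vertical arrow by $x \mapsto [E_x, P_x]$, and the bottom horizontal
arrow by

\[ [\mathcal{E},\mathcal{P}] \longmapsto \sum_{\substack{[\mathcal{E}[p] : \Lambda] = p \\ \langle\mathcal{P}\rangle \cap \Lambda = \{0\}}} [\mathcal{E}/\Lambda, \mathcal{P} + \Lambda], \]

where $\langle\mathcal{P}\rangle$ denotes the cyclic subgroup generated by $\mathcal{P}$.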
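
The number of terms in the sums of Proposition 8 can be tabulated in a toy model. The Python sketch below is an added illustration; the function name `hecke_subgroups` is hypothetical, and it assumes that the M-torsion of the curve is modelled by $(\mathbf{Z}/M\mathbf{Z})^2$ with $\mathcal{C}$ generated by the vector (0, M/N), where M is the least common multiple of N and p. It lists the subgroups of order p, which are the subgroups of index p in the p-torsion, that meet $\mathcal{C}$ trivially.

```python
from itertools import product
from math import gcd


def hecke_subgroups(n, p):
    """Subgroups of order p in the p-torsion of (Z/MZ)^2 meeting C trivially."""
    m = n * p // gcd(n, p)
    cyclic_c = {(0, (k * (m // n)) % m) for k in range(n)}
    step = m // p  # the p-torsion consists of vectors with both entries divisible by step
    found = set()
    for r, s in product(range(p), repeat=2):
        if (r, s) == (0, 0):
            continue
        gen = (r * step, s * step)
        group = frozenset(((k * gen[0]) % m, (k * gen[1]) % m) for k in range(p))
        if group & cyclic_c == {(0, 0)}:
            found.add(group)
    return found
```

When p does not divide N all p + 1 subgroups of order p occur; when p divides N the subgroup of order p contained in $\mathcal{C}$ is excluded and p remain.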


Proof. For z € Xo(N)(C)g the formula 


T(z) = Ss) (mult. yp) Yp(z) 
ZEZ 
Pp(z)=a 
can be written simply as 


To(z)= >, ¥p(z), 


z€yp (x) 
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because the morphism yp : Xo(N,p) — Xo(N) is unramified outside S: 
indeed the corresponding extension of function fields K,/K is contained 
inside the extension Q(t, E[M])/Q(t) and is therefore unramified outside 
the places where & has bad reduction. Here M denotes the least common 
multiple of N and p, as before. 

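Consider triples $(\mathcal{E}, \mathcal{C}, \Lambda)$, where $\mathcal{E}$ is an
elliptic curve over $\mathbf{C}$, $\mathcal{C}$ is a cyclic subgroup of $\mathcal{E}$
of order $N$, and $\Lambda$ is a cyclic subgroup of $\mathcal{E}$ of order $p$ which
intersects $\mathcal{C}$ trivially. We write $[\mathcal{E}, \mathcal{C}, \Lambda]$ for
the isomorphism class of $(\mathcal{E}, \mathcal{C}, \Lambda)$ and
$\mathrm{Ell}_0(N,p)(\mathbf{C})$ for the set of isomorphism classes. If we define
maps $\varphi$ and $\psi$ from $\mathrm{Ell}_0(N,p)(\mathbf{C})$ to
$\mathrm{Ell}_0(N)(\mathbf{C})$ by

$$\varphi([\mathcal{E}, \mathcal{C}, \Lambda]) = [\mathcal{E}, \mathcal{C}]$$

and

$$\psi([\mathcal{E}, \mathcal{C}, \Lambda]) = [\mathcal{E}/\Lambda, (\mathcal{C} + \Lambda)/\Lambda],$$

then the map

$$[\mathcal{E}, \mathcal{C}] \mapsto \sum_{\substack{[\mathcal{E}[p]:\Lambda] = p \\ \mathcal{C} \cap \Lambda = \{0\}}} [\mathcal{E}/\Lambda, (\mathcal{C} + \Lambda)/\Lambda]$$

in the statement of the proposition has the form

$$x \mapsto \sum_{z \in \varphi^{-1}(x)} \psi(z).$$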


Therefore it suffices to check that the diagrams

$$\begin{array}{ccc}
X_0(N,p)(\mathbf{C})_{S'} & \xrightarrow{\ \varphi_p\ } & X_0(N)(\mathbf{C})_S \\
\downarrow & & \downarrow \\
\mathrm{Ell}_0(N,p)(\mathbf{C}) & \xrightarrow{\ \varphi\ } & \mathrm{Ell}_0(N)(\mathbf{C})
\end{array}$$

and

$$\begin{array}{ccc}
X_0(N,p)(\mathbf{C})_{S'} & \xrightarrow{\ \psi_p\ } & X_0(N)(\mathbf{C})_{S''} \\
\downarrow & & \downarrow \\
\mathrm{Ell}_0(N,p)(\mathbf{C}) & \xrightarrow{\ \psi\ } & \mathrm{Ell}_0(N)(\mathbf{C})
\end{array}$$

commute, where the right vertical arrows are the maps $x \mapsto [E_x, C_x]$ of
Proposition 1 and the left vertical arrows are given by

$$X_0(N,p)(\mathbf{C})_{S'} \to \mathrm{Ell}_0(N,p)(\mathbf{C}), \qquad z \mapsto [E_z, C_z, \Pi_z].$$

As before, $\Pi$ denotes a subgroup of $E$ of order $p$ which intersects $C$
trivially, and the subscript $z$ indicates reduction modulo the maximal ideal of
the discrete valuation ring corresponding to $z$ in the complex function field
of $X_0(N,p)$. Now the commutativity of the first diagram amounts to the
equation

$$[E_z, C_z] = [E_{\varphi_p(z)}, C_{\varphi_p(z)}],$$

which follows from the compatibility of reduction at a good place with
base extension. To verify that the second diagram commutes put
$t' = j(E/\Pi)$, choose an elliptic curve $E'$ over $\mathbf{Q}(t')$ with invariant
$t'$, and let $C'$ be the image of $(C + \Pi)/\Pi$ under an isomorphism
$E/\Pi \to E'$. We must check that

$$(1) \qquad [E_z/\Pi_z, (C_z + \Pi_z)/\Pi_z] = [E'_{\psi_p(z)}, C'_{\psi_p(z)}].$$

We verify this equation in two steps. First,

$$(2) \qquad [E_z/\Pi_z, (C_z + \Pi_z)/\Pi_z] = [(E/\Pi)_z, ((C + \Pi)/\Pi)_z]$$

by the compatibility of reduction with isogenies. Next let $\theta$ be an
isomorphism from $E/\Pi$ to $E'$, so that $C' = \theta((C + \Pi)/\Pi)$. Then the
reduction of $\theta$ gives an isomorphism from $(E/\Pi)_z$ to $(E')_{\psi_p(z)}$
sending $((C + \Pi)/\Pi)_z$ to $(C')_{\psi_p(z)}$, whence

$$(3) \qquad [(E/\Pi)_z, ((C + \Pi)/\Pi)_z] = [(E')_{\psi_p(z)}, (C')_{\psi_p(z)}].$$

Together, (2) and (3) give (1). The argument for $X_1(N)$ is similar.


2.4. The Hecke correspondences on the upper half-plane. Given
$d \in (\mathbf{Z}/N\mathbf{Z})^\times$ we write $\langle d \rangle$ to denote any
element of $\Gamma_0(N)$ with lower right-hand entry congruent to $d$ modulo $N$.
Also, if $d \in \mathbf{Z}$ is an integer prime to $N$ then we write
$\langle d \rangle$ for $\langle d \bmod N \rangle$.


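Proposition 9. There is a commutative diagram

$$\begin{array}{ccc}
X_0(N)(\mathbf{C}) & \xrightarrow{\ T_p\ } & \mathrm{Div}(X_0(N)(\mathbf{C})) \\
\downarrow & & \downarrow \\
\Gamma_0(N)\backslash\mathfrak{H}^* & \longrightarrow & \mathrm{Div}(\Gamma_0(N)\backslash\mathfrak{H}^*),
\end{array}$$

where the left vertical arrow is the isomorphism of Proposition 7, the right
vertical arrow is the corresponding homomorphism of free abelian groups,
and the bottom horizontal arrow is the map

$$[z] \mapsto \begin{cases}
\sum_{\nu=0}^{p-1} [(z + \nu)/p] + [pz] & \text{if } p \nmid N, \\
\sum_{\nu=0}^{p-1} [(z + \nu)/p] & \text{if } p \mid N.
\end{cases}$$

The same is true if $X_0(N)$ is replaced by $X_1(N)$ and $\Gamma_0(N)$ by
$\Gamma_1(N)$, provided that the bottom horizontal arrow is modified as follows:
in the case $p \nmid N$, the term $[pz]$ is replaced by $[\langle p \rangle pz]$.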


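Proof. By a continuity argument it suffices to check that the diagram
commutes when $X_0(N)(\mathbf{C})$ is replaced by $X_0(N)(\mathbf{C})_S$ for some
finite set $S$. Propositions 7 and 8 then reduce the problem to the following:
Given $[\mathcal{T}, \mathcal{C}] \in \mathrm{Tori}_0(N)$ with
$\mathcal{T} = \mathbf{C}/L_z$ and $\mathcal{C} = \langle 1/N + L_z \rangle$, show that

$$\sum_{\substack{[\mathcal{T}[p]:\Lambda] = p \\ \mathcal{C} \cap \Lambda = \{0\}}} [\mathcal{T}/\Lambda, (\mathcal{C} + \Lambda)/\Lambda]$$

coincides with

$$\sum_{\nu=0}^{p-1} [\mathbf{C}/L_{(z+\nu)/p}, \langle 1/N + L_{(z+\nu)/p} \rangle] + [\mathbf{C}/L_{pz}, \langle 1/N + L_{pz} \rangle],$$

the last term being omitted if $p$ divides $N$. Now the subgroups of order $p$
in $\mathbf{C}/L_z$ which intersect $\langle 1/N + L_z \rangle$ trivially are

$$\langle (z + \nu)/p + L_z \rangle \qquad (0 \le \nu \le p - 1)$$

and also

$$\langle 1/p + L_z \rangle$$

if $p \nmid N$. Furthermore, for $\mathcal{T} = \mathbf{C}/L_z$ and
$\Lambda = \langle (z + \nu)/p + L_z \rangle$ we have
$\mathcal{T}/\Lambda = \mathbf{C}/L_{(z+\nu)/p}$. Hence the only point to check is
that if $p \nmid N$ then

$$[\mathbf{C}/(z\mathbf{Z} \oplus p^{-1}\mathbf{Z}), \langle 1/N + (z\mathbf{Z} \oplus p^{-1}\mathbf{Z}) \rangle] = [\mathbf{C}/L_{pz}, \langle 1/N + L_{pz} \rangle].$$

This holds because multiplication by $p$ maps
$\mathbf{C}/(z\mathbf{Z} \oplus p^{-1}\mathbf{Z})$ isomorphically onto
$\mathbf{C}/L_{pz}$, and because the cosets of $1/N$ and $p/N$ generate the same
subgroup of $\mathbf{C}/L_{pz}$ for $p \nmid N$.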

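The argument for $\Gamma_1(N)$ is much the same, except that in the case
$p \nmid N$ one must check that

$$[\mathbf{C}/L_{pz}, p/N + L_{pz}] = [\mathbf{C}/L_{\langle p \rangle pz}, 1/N + L_{\langle p \rangle pz}].$$

Put $w = pz$ and $\gamma = \langle p \rangle$, and write
$\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. Since
$L_w = w\mathbf{Z} \oplus \mathbf{Z}$ and
$L_{\gamma w} = \frac{aw + b}{cw + d}\mathbf{Z} \oplus \mathbf{Z}$, multiplication by
$cw + d$ defines an isomorphism from $\mathbf{C}/L_{\gamma w}$ to $\mathbf{C}/L_w$
sending $1/N + L_{\gamma w}$ to $(cw + d)/N + L_w$. The latter coset
coincides with $p/N + L_w$, because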


$$(cw + d)/N - p/N = (c/N)w + (d - p)/N \in L_w.$$

2.5. The diamond automorphisms. The next construction pertains
only to $X_1(N)$. Choose an elliptic curve $E$ over $\mathbf{Q}(t)$ with invariant
$t$, and make the usual identification of
$\mathrm{Gal}(\mathbf{Q}(t, E[N])/\mathbf{Q}(t))$ with
$\mathrm{GL}(2, \mathbf{Z}/N\mathbf{Z})$, and of the function field $K$ of $X_1(N)$
with the fixed field of

$$H = \left\{ \pm\begin{pmatrix} a & 0 \\ b & 1 \end{pmatrix} \in \mathrm{GL}(2, \mathbf{Z}/N\mathbf{Z}) : a \in (\mathbf{Z}/N\mathbf{Z})^\times,\ b \in \mathbf{Z}/N\mathbf{Z} \right\}.$$

Since $H$ is normal in the lower triangular subgroup $B$, the quotient group
$B/H$ acts as a group of automorphisms of $K$ and hence of $X_1(N)$. We
shall identify $B/H$ with $(\mathbf{Z}/N\mathbf{Z})^\times/\{\pm 1\}$ via the map
sending the coset of $\begin{pmatrix} a & 0 \\ b & d \end{pmatrix}$ modulo $H$ to
the coset of $d$ modulo $\{\pm 1\}$. The automorphism of $X_1(N)$ corresponding
to the coset of $d \in (\mathbf{Z}/N\mathbf{Z})^\times$ modulo $\{\pm 1\}$ will be
denoted $\langle d \rangle$. Of course we have already used the symbol
$\langle d \rangle$ to denote an element of $\Gamma_0(N)$. The next two
propositions show that the notations are consistent and that $\langle d \rangle$
may also be used for the bijection from $\mathrm{Ell}_1(N)(\mathbf{C})$ to itself
given by $[\mathcal{E}, P] \mapsto [\mathcal{E}, dP]$. The proofs are left to the
reader.


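Proposition 10. Let $E$ be an elliptic curve over $\mathbf{Q}(t)$ with invariant
$t$, and let $S$ be a subset of $\mathbf{P}^1(\mathbf{C})$ containing all places
where $E$ has bad reduction. Fix an ordered basis for $E[N]$ over
$\mathbf{Z}/N\mathbf{Z}$ and let $P$ be the second element of this basis. For
$d \in (\mathbf{Z}/N\mathbf{Z})^\times$ the diagram

$$\begin{array}{ccc}
X_1(N)(\mathbf{C})_S & \xrightarrow{\ \langle d \rangle\ } & X_1(N)(\mathbf{C})_S \\
\downarrow & & \downarrow \\
\mathrm{Ell}_1(N)(\mathbf{C}) & \longrightarrow & \mathrm{Ell}_1(N)(\mathbf{C})
\end{array}$$

commutes, where the bottom horizontal arrow is the map
$[\mathcal{E}, P] \mapsto [\mathcal{E}, dP]$ and the vertical arrows are the map
$x \mapsto [E_x, P_x]$ of Proposition 1.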


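Proposition 11. There is a commutative diagram

$$\begin{array}{ccc}
X_1(N)(\mathbf{C}) & \xrightarrow{\ \langle d \rangle\ } & X_1(N)(\mathbf{C}) \\
\downarrow & & \downarrow \\
\Gamma_1(N)\backslash\mathfrak{H}^* & \longrightarrow & \Gamma_1(N)\backslash\mathfrak{H}^*,
\end{array}$$

where the bottom horizontal arrow is the map $[z] \mapsto [\langle d \rangle z]$
and the vertical arrows are the isomorphism of Proposition 7.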


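For the application to $L$-functions toward which we are heading it is
enough to consider the curve $X_1(N)$, and henceforth $X_0(N)$ will drop out
of sight.

2.6. Hecke correspondences and the Frobenius automorphism.
Let $p$ be a prime not dividing $N$, and fix a prime ideal $\mathfrak{p}$ of
$\overline{\mathbf{Q}}$ lying over $p$. Write $\overline{\mathbf{F}}_p$ for the residue
class field of $\mathfrak{p}$. Reduction modulo $\mathfrak{p}$ will be denoted by a
tilde: for example, if $\mathcal{E}$ is an elliptic curve over
$\overline{\mathbf{Q}}$ with good reduction at $\mathfrak{p}$ then
$\tilde{\mathcal{E}}$ denotes the elliptic curve over $\overline{\mathbf{F}}_p$
obtained from $\mathcal{E}$ by reduction modulo $\mathfrak{p}$. We define sets
$\mathrm{Ell}_1(N)(\overline{\mathbf{Q}})$ and
$\mathrm{Ell}_1(N)(\overline{\mathbf{F}}_p)$ by replacing $\mathbf{C}$ by
$\overline{\mathbf{Q}}$ or $\overline{\mathbf{F}}_p$ respectively in the definition
of $\mathrm{Ell}_1(N)(\mathbf{C})$. Thus $\mathrm{Ell}_1(N)(\overline{\mathbf{Q}})$ can
be identified with the subset of $\mathrm{Ell}_1(N)(\mathbf{C})$ consisting of
classes $[\mathcal{E}, P]$ such that $j(\mathcal{E}) \in \overline{\mathbf{Q}}$. Also,
we let $\mathrm{Ell}_1(N)(\overline{\mathbf{Q}})_{\mathrm{gd}}$ denote the subset of
$\mathrm{Ell}_1(N)(\overline{\mathbf{Q}})$ consisting of classes $[\mathcal{E}, P]$
such that $\mathcal{E}$ has good reduction at $\mathfrak{p}$. Under our
assumption that $p$ does not divide $N$ we have a well-defined map

$$\mathrm{Ell}_1(N)(\overline{\mathbf{Q}})_{\mathrm{gd}} \to \mathrm{Ell}_1(N)(\overline{\mathbf{F}}_p), \qquad [\mathcal{E}, P] \mapsto [\tilde{\mathcal{E}}, \tilde{P}],$$

because reduction modulo $\mathfrak{p}$ is injective on $N$-torsion.

We recall that an elliptic curve $\mathcal{E}$ over $\overline{\mathbf{Q}}$ is said
to have ordinary good reduction at $\mathfrak{p}$ if $\tilde{\mathcal{E}}[p]$ has
order $p$. If $\mathcal{E}$ has ordinary reduction at $\mathfrak{p}$ then
reduction modulo $\mathfrak{p}$ defines a surjective map

$$\mathcal{E}[p] \to \tilde{\mathcal{E}}[p]$$

with kernel a subgroup of $\mathcal{E}[p]$ of order (or index) $p$. In addition
$\mathcal{E}[p]$ has exactly $p$ other subgroups of index $p$.

We use a superscript $p$ to indicate the image of an object under the
Frobenius automorphism of $\overline{\mathbf{F}}_p$ and a superscript $p^{-1}$ to
indicate the image under the inverse of the Frobenius automorphism.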
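
The count of subgroups used above (one kernel of reduction plus exactly p other subgroups of index p) can be illustrated by a minimal sketch. It assumes the standard identification of the p-torsion of an elliptic curve in characteristic 0 with (Z/pZ)^2; the function name `order_p_subgroups` and the choice of the first coordinate axis as a stand-in for the kernel of reduction are hypothetical, chosen only for illustration.

```python
from itertools import product


def order_p_subgroups(p):
    """Return all subgroups of order p in (Z/pZ)^2 for a prime p.

    Each subgroup is the set of multiples of a nonzero vector and is
    returned as a frozenset of points.
    """
    subgroups = set()
    for v in product(range(p), repeat=2):
        if v == (0, 0):
            continue
        line = frozenset(((k * v[0]) % p, (k * v[1]) % p) for k in range(p))
        subgroups.add(line)
    return subgroups


for p in (2, 3, 5, 7):
    lines = order_p_subgroups(p)
    # Hypothetical stand-in for the kernel of reduction: the first axis.
    kernel = frozenset((k, 0) for k in range(p))
    others = [line for line in lines if line != kernel]
    print(p, len(lines), len(others))
```

Since the group has order p^2, subgroups of order p and subgroups of index p are the same thing here, so the two printed counts are the total number of such subgroups and the number different from the chosen kernel.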


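Proposition 12. Let $\mathcal{E}$ be an elliptic curve over
$\overline{\mathbf{Q}}$ with ordinary reduction at $\mathfrak{p}$, and let
$\Lambda_0$ be the kernel of the reduction map

$$\mathcal{E}[p] \to \tilde{\mathcal{E}}[p].$$

Let $P$ be a point of order $N$ on $\mathcal{E}$. If $\Lambda$ is a subgroup of
$\mathcal{E}[p]$ of index $p$, then

$$[\widetilde{\mathcal{E}/\Lambda}, \widetilde{P + \Lambda}] = \begin{cases}
[\tilde{\mathcal{E}}^{p}, \tilde{P}^{p}] & \text{if } \Lambda = \Lambda_0, \\
[\tilde{\mathcal{E}}^{p^{-1}}, p\tilde{P}^{p^{-1}}] & \text{if } \Lambda \ne \Lambda_0.
\end{cases}$$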


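Proof. Let $\lambda_0 : \mathcal{E} \to \mathcal{E}/\Lambda_0$ be an isogeny with
kernel $\Lambda_0$ and $\mu_0 : \mathcal{E}/\Lambda_0 \to \mathcal{E}$ the dual
isogeny. The image of $(\mathcal{E}/\Lambda_0)[p]$ under $\mu_0$ is $\Lambda_0$.
Indeed, since $\mu_0$ is a $p$-isogeny, the image of $(\mathcal{E}/\Lambda_0)[p]$
under $\mu_0$ is a group of order $p$; but if this group were not contained in
$\Lambda_0$ then the image of $(\mathcal{E}/\Lambda_0)[p]$ under
$\lambda_0 \circ \mu_0$ would not be zero, contradicting the fact that
$\lambda_0 \circ \mu_0$ is multiplication by $p$. Now consider the commutative
diagram

$$\begin{array}{ccccc}
\mathcal{E}[p] & \xrightarrow{\ \lambda_0\ } & (\mathcal{E}/\Lambda_0)[p] & \xrightarrow{\ \mu_0\ } & \mathcal{E}[p] \\
\downarrow & & \downarrow & & \downarrow \\
\tilde{\mathcal{E}}[p] & \xrightarrow{\ \tilde{\lambda}_0\ } & \widetilde{\mathcal{E}/\Lambda_0}[p] & \xrightarrow{\ \tilde{\mu}_0\ } & \tilde{\mathcal{E}}[p]
\end{array}$$

where the vertical arrows are reduction modulo $\mathfrak{p}$ and hence surjective.
Since the image of $(\mathcal{E}/\Lambda_0)[p]$ under $\mu_0$ is $\Lambda_0$, the
commutativity of the right-hand square shows that $\tilde{\mu}_0$ is zero on
$\widetilde{\mathcal{E}/\Lambda_0}[p]$. Now $\tilde{\mathcal{E}}$ is ordinary by
assumption, and since $\widetilde{\mathcal{E}/\Lambda_0}$ is isogenous to
$\tilde{\mathcal{E}}$, it too is ordinary. Hence the fact that $\tilde{\mu}_0$ is
zero on $\widetilde{\mathcal{E}/\Lambda_0}[p]$ means that $\tilde{\mu}_0$ is a
separable $p$-isogeny. But multiplication by $p$ is inseparable. Consequently
$\tilde{\lambda}_0$ is an inseparable (and hence purely inseparable) isogeny of
degree $p$, and we can write $\tilde{\lambda}_0 = \theta \circ \mathfrak{F}$,
where $\mathfrak{F} : \tilde{\mathcal{E}} \to \tilde{\mathcal{E}}^p$ is the Frobenius
morphism of degree $p$ and
$\theta : \tilde{\mathcal{E}}^p \to \widetilde{\mathcal{E}/\Lambda_0}$ is
an isomorphism. Therefore

$$[\tilde{\lambda}_0(\tilde{\mathcal{E}}), \tilde{\lambda}_0(\tilde{P})] = [\theta(\tilde{\mathcal{E}}^p), \theta(\tilde{P}^p)].$$

But the left-hand side is
$[\widetilde{\mathcal{E}/\Lambda_0}, \widetilde{P + \Lambda_0}]$, and the
right-hand side is $[\tilde{\mathcal{E}}^p, \tilde{P}^p]$. Hence we get the stated
equality.

Now suppose that $\Lambda \ne \Lambda_0$. Choose an isogeny
$\lambda : \mathcal{E} \to \mathcal{E}/\Lambda$ with kernel $\Lambda$, and consider
the curve $\lambda(\mathcal{E})$ ($= \mathcal{E}/\Lambda$) together with its
subgroup $\lambda(\Lambda_0)$ of order $p$. Since $\lambda(\mathcal{E})$ is
isogenous to $\mathcal{E}$, it has ordinary reduction at $\mathfrak{p}$, and
consequently the kernel of reduction mod $\mathfrak{p}$ on
$\lambda(\mathcal{E})[p]$ is a subgroup of order $p$. The calculation

$$\widetilde{\lambda(\Lambda_0)} = \tilde{\lambda}(\tilde{\Lambda}_0) = \tilde{\lambda}(\{0\}) = \{0\}$$

shows that $\lambda(\Lambda_0)$ is contained in this subgroup and hence coincides
with it, since both have order $p$. Therefore we can apply to
$\lambda(\mathcal{E})$ and $\lambda(\Lambda_0)$ what we have already proved for
$\mathcal{E}$ and $\Lambda_0$:

$$(1) \qquad [\widetilde{\lambda(\mathcal{E})/\lambda(\Lambda_0)}, \widetilde{\lambda(P) + \lambda(\Lambda_0)}] = [\widetilde{\lambda(\mathcal{E})}^{p}, \widetilde{\lambda(P)}^{p}].$$

But if $\mu : \lambda(\mathcal{E}) \to \lambda(\mathcal{E})/\lambda(\Lambda_0)$ is any
isogeny with kernel $\lambda(\Lambda_0)$, then by definition

$$[\lambda(\mathcal{E})/\lambda(\Lambda_0), \lambda(P) + \lambda(\Lambda_0)] = [(\mu \circ \lambda)(\mathcal{E}), (\mu \circ \lambda)(P)].$$

We choose $\mu$ to be the isogeny dual to $\lambda$, and then
$[(\mu \circ \lambda)(\mathcal{E}), (\mu \circ \lambda)(P)]$ becomes
$[\mathcal{E}, pP]$. Thus the left-hand side of (1) is
$[\tilde{\mathcal{E}}, p\tilde{P}]$, and (1) can be rewritten

$$(2) \qquad [\tilde{\mathcal{E}}, p\tilde{P}] = [\widetilde{\mathcal{E}/\Lambda}^{p}, \widetilde{P + \Lambda}^{p}].$$

Applying the inverse Frobenius automorphism to (2) we obtain the stated
formula.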


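Let $\mathrm{Ell}_1(N)(\overline{\mathbf{Q}})_{\mathrm{ord}}$ be the subset of
$\mathrm{Ell}_1(N)(\overline{\mathbf{Q}})_{\mathrm{gd}}$ consisting of classes
$[\mathcal{E}, P]$ such that $\mathcal{E}$ has ordinary reduction at
$\mathfrak{p}$. Reduction of isomorphism classes

$$\mathrm{Ell}_1(N)(\overline{\mathbf{Q}})_{\mathrm{ord}} \to \mathrm{Ell}_1(N)(\overline{\mathbf{F}}_p), \qquad [\mathcal{E}, P] \mapsto [\tilde{\mathcal{E}}, \tilde{P}],$$

extends uniquely to a homomorphism

$$\mathrm{red}_{\mathfrak{p}} : \mathrm{Div}(\mathrm{Ell}_1(N)(\overline{\mathbf{Q}})_{\mathrm{ord}}) \to \mathrm{Div}(\mathrm{Ell}_1(N)(\overline{\mathbf{F}}_p)).$$

Proposition 13. Let $\sigma_{\mathfrak{p}} \in \mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q})$
be a Frobenius element at $\mathfrak{p}$. Then

$$T_p = \sigma_{\mathfrak{p}} + p\langle p \rangle\sigma_{\mathfrak{p}}^{-1}$$

when both sides are regarded as maps

$$\mathrm{Ell}_1(N)(\overline{\mathbf{Q}})_{\mathrm{ord}} \to \mathrm{Div}(\mathrm{Ell}_1(N)(\overline{\mathbf{Q}})_{\mathrm{ord}})/\mathrm{Ker}(\mathrm{red}_{\mathfrak{p}}).$$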


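Proof. Let $\mathcal{E}$ be an elliptic curve over $\overline{\mathbf{Q}}$ with
ordinary reduction at $\mathfrak{p}$, and let $P$ be a point of order $N$ on
$\mathcal{E}$. We must show that $T_p([\mathcal{E}, P])$ and
$(\sigma_{\mathfrak{p}} + p\langle p \rangle\sigma_{\mathfrak{p}}^{-1})([\mathcal{E}, P])$
have the same image under $\mathrm{red}_{\mathfrak{p}}$. Now Proposition 8 gives

$$T_p([\mathcal{E}, P]) = \sum_{\Lambda} [\mathcal{E}/\Lambda, P + \Lambda],$$

where the sum runs over subgroups $\Lambda \subset \mathcal{E}[p]$ of index $p$
which have trivial intersection with the cyclic group generated by $P$. On the
other hand,

$$(\sigma_{\mathfrak{p}} + p\langle p \rangle\sigma_{\mathfrak{p}}^{-1})([\mathcal{E}, P]) = [\mathcal{E}^{\sigma_{\mathfrak{p}}}, P^{\sigma_{\mathfrak{p}}}] + p[\mathcal{E}^{\sigma_{\mathfrak{p}}^{-1}}, pP^{\sigma_{\mathfrak{p}}^{-1}}].$$

Thus we must check that

$$\sum_{\Lambda} [\widetilde{\mathcal{E}/\Lambda}, \widetilde{P + \Lambda}] = [\tilde{\mathcal{E}}^{p}, \tilde{P}^{p}] + p[\tilde{\mathcal{E}}^{p^{-1}}, p\tilde{P}^{p^{-1}}].$$

This follows from Proposition 12, because the sum on the left-hand side runs
over $p + 1$ subgroups $\Lambda$, exactly one of which coincides with the kernel
of $\mathcal{E}[p] \to \tilde{\mathcal{E}}[p]$.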


Given a smooth projective curve $X$ and a correspondence
$T = (Z, \varphi, \psi)$ on $X$, we use the same letter $T$ to denote the
endomorphism $\psi_* \circ \varphi^*$ of its Jacobian variety $\mathrm{Jac}(X)$. If
$k$ is an algebraically closed field then the map on points
$\mathrm{Jac}(X)(k) \to \mathrm{Jac}(X)(k)$ can be obtained from
$T : X(k) \to \mathrm{Div}(X(k))$ by extending the latter map to
$\mathrm{Div}(X(k))$, restricting to the subgroup $\mathrm{Div}^0(X(k))$ of divisors
of degree 0, and then passing to divisor classes.

We are concerned with the case $X = X_1(N)$, $T = T_p$. We denote the
Jacobian variety $\mathrm{Jac}(X_1(N))$ simply by $J_1(N)$. Also, the automorphism
of $J_1(N)$ induced by the diamond automorphism $\langle d \rangle$ of $X_1(N)$
will be denoted by the same symbol $\langle d \rangle$. If $\ell$ is a prime and
$n$ a positive integer, then the ring of endomorphisms of $J_1(N)$ acts on the
abelian group $J_1(N)[\ell^n]$. So does
$\mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q})$, because $J_1(N)$ is defined over
$\mathbf{Q}$. In sketching a proof of the next statement we shall simply quote
what we need from the work of Igusa [10], in particular the fact that $X_1(N)$
has good reduction at primes not dividing $N$.

Theorem 2. Let $\sigma_{\mathfrak{p}} \in \mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q})$
be a Frobenius element at $\mathfrak{p}$. Then for $\ell \ne p$ and $n \ge 1$,

$$T_p = \sigma_{\mathfrak{p}} + p\langle p \rangle\sigma_{\mathfrak{p}}^{-1}$$

as endomorphisms of $J_1(N)[\ell^n]$.


Proof. Let $\tilde{X}_1(N)$ denote the reduction of $X_1(N)$ modulo $p$. As a
map on points, the reduction map
$X_1(N)(\overline{\mathbf{Q}}) \to \tilde{X}_1(N)(\overline{\mathbf{F}}_p)$ is
compatible with the map $[\mathcal{E}, P] \mapsto [\tilde{\mathcal{E}}, \tilde{P}]$
in a sense which we shall now describe.

Let $\mathbf{Z}[t]_{(p)}$ denote the localization of $\mathbf{Z}[t]$ at the prime
ideal generated by $p$. We say that an elliptic curve $E$ over $\mathbf{Q}(t)$ has
good reduction at $p$ if there is a generalized Weierstrass equation for $E$
over $\mathbf{Z}[t]_{(p)}$ with discriminant a unit of $\mathbf{Z}[t]_{(p)}$. The
reduction of this equation modulo $p\mathbf{Z}[t]_{(p)}$ then defines an elliptic
curve $\tilde{E}$ over $\mathbf{F}_p(t)$. Now let $E$ be an elliptic curve over
$\mathbf{Q}(t)$ with invariant $t$ and good reduction at $p$. For example we can
take $E$ to be the curve defined by the equation

$$y^2 + xy = x^3 - \frac{36}{t - 1728}\,x - \frac{1}{t - 1728}$$

of discriminant $t^2/(t - 1728)^3$. Let $P$ be a point of order $N$ on $E$, let
$\tilde{P} \in \tilde{E}[N]$ be its reduction modulo $\mathfrak{p}$, and let
$\tilde{K}$ be the fixed field of

$$\{\sigma \in \mathrm{Gal}(\overline{\mathbf{F}_p(t)}/\mathbf{F}_p(t)) : \sigma(\tilde{P}) = \pm\tilde{P}\},$$

where $\overline{\mathbf{F}_p(t)}$ denotes a separable algebraic closure of
$\mathbf{F}_p(t)$. Then $\tilde{X}_1(N)$ is characterized up to isomorphism as the
smooth projective curve over $\mathbf{F}_p$ with function field $\tilde{K}$.
Furthermore, by viewing $\tilde{E}$ as an elliptic curve over $\tilde{K}$ one
obtains a reduction map

$$\tilde{X}_1(N)(\overline{\mathbf{F}}_p)_{S'} \to \mathrm{Ell}_1(N)(\overline{\mathbf{F}}_p), \qquad x \mapsto [(\tilde{E})_x, (\tilde{P})_x]$$

for any subset $S'$ of $\mathbf{P}^1(\overline{\mathbf{F}}_p)$ containing the places
where $\tilde{E}$ has bad reduction.


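Let $S$ be the inverse image of $S'$ under the reduction map
$\mathbf{P}^1(\overline{\mathbf{Q}}) \to \mathbf{P}^1(\overline{\mathbf{F}}_p)$.
Then the diagram of reduction maps

$$\begin{array}{ccc}
X_1(N)(\overline{\mathbf{Q}})_S & \longrightarrow & \mathrm{Ell}_1(N)(\overline{\mathbf{Q}})_{\mathrm{gd}} \\
\downarrow & & \downarrow \\
\tilde{X}_1(N)(\overline{\mathbf{F}}_p)_{S'} & \longrightarrow & \mathrm{Ell}_1(N)(\overline{\mathbf{F}}_p)
\end{array}$$

commutes.

Henceforth we take $S'$ to be the set of places where $\tilde{E}$ has bad or
supersingular reduction. Note that $S'$ is a finite set. The commutativity of
the above diagram allows us to replace
$\mathrm{Ell}_1(N)(\overline{\mathbf{Q}})_{\mathrm{ord}}$ and
$\mathrm{Ell}_1(N)(\overline{\mathbf{F}}_p)$ in the statement of Proposition 13 by
$X_1(N)(\overline{\mathbf{Q}})_S$ and $\tilde{X}_1(N)(\overline{\mathbf{F}}_p)_{S'}$
respectively.
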


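Now let $\tilde{J}_1(N)$ denote the reduction of $J_1(N)$ modulo $p$,
identifiable with the Jacobian of $\tilde{X}_1(N)$. There is a commutative
diagram

$$\begin{array}{ccc}
\mathrm{Div}^0(X_1(N)(\overline{\mathbf{Q}})_S) & \xrightarrow{\ \alpha\ } & J_1(N)(\overline{\mathbf{Q}}) \\
\downarrow & & \downarrow \\
\mathrm{Div}^0(\tilde{X}_1(N)(\overline{\mathbf{F}}_p)_{S'}) & \longrightarrow & \tilde{J}_1(N)(\overline{\mathbf{F}}_p)
\end{array}$$

in which the vertical arrows are reduction modulo $\mathfrak{p}$ and the
horizontal arrows send a divisor to the point on the Jacobian representing its
divisor class. Since $S'$ is finite, $\alpha$ is surjective. Let
$L \in J_1(N)(\overline{\mathbf{Q}})$ be a torsion point of $\ell$-power order; we
must show that

$$(T_p - \sigma_{\mathfrak{p}} - p\langle p \rangle\sigma_{\mathfrak{p}}^{-1})(L) = 0.$$

In fact it is enough to show that this equation holds after reduction modulo
$\mathfrak{p}$, because reduction mod $\mathfrak{p}$ is injective on
$\ell$-torsion. Write $L = \alpha(D)$ with
$D \in \mathrm{Div}^0(X_1(N)(\overline{\mathbf{Q}})_S)$. According to Proposition 13,
the divisor
$(T_p - \sigma_{\mathfrak{p}} - p\langle p \rangle\sigma_{\mathfrak{p}}^{-1})(D)$
reduces to 0 modulo $\mathfrak{p}$, and consequently so does the point
$(T_p - \sigma_{\mathfrak{p}} - p\langle p \rangle\sigma_{\mathfrak{p}}^{-1})(L)$.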


3. L-FUNCTIONS


Theorem 2 is at best an approximation to the Eichler-Shimura relations,
because it refers only to Frobenius elements of
$\mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q})$, not to the Frobenius
correspondence in characteristic $p$ (cf. [19], p. 17, formulas (I) and (II)).
Nevertheless, it suffices for the application to $L$-functions, to which we now
turn.


3.1. The Hasse-Weil conjecture. Originally conceived of as an assertion
about the zeta function of a smooth projective variety over a number
field, the conjecture has since evolved into a more general statement about
$L$-functions of motives. Here we shall restrict our attention to motives of a
very special kind, namely motives afforded by $H^1$ of an abelian variety over
$\mathbf{Q}$ and more generally products of such motives with Artin motives. To
begin with we take the Artin motive to be trivial. Let $A$ be an abelian variety
of dimension $g$ defined over $\mathbf{Q}$, and recall that for every prime
number $\ell$ one has an $\ell$-adic representation

$$\rho_\ell : \mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q}) \to \mathrm{Aut}(V_\ell(A)) \cong \mathrm{GL}(2g, \mathbf{Q}_\ell),$$
where $V_\ell(A) = \mathbf{Q}_\ell \otimes_{\mathbf{Z}_\ell} T_\ell(A)$ and $T_\ell(A)$
is the Tate module of $A$:

$$T_\ell(A) = \varprojlim_{n} A[\ell^n].$$

We let $\rho_\ell^*$ denote the contragredient representation on the dual space
$V_\ell^*(A)$ of $V_\ell(A)$. Given a prime number $p$, one defines a polynomial
$P_p(A,t) \in \mathbf{Z}[t]$ by the formula

$$P_p(A,t) = \det\bigl(1 - t\rho_\ell^*(\sigma_{\mathfrak{p}}^{-1}) \,\big|\, V_\ell^*(A)^{I(\mathfrak{p})}\bigr),$$

where $\ell$ is any prime number different from $p$, $I(\mathfrak{p})$ and
$\sigma_{\mathfrak{p}}$ denote respectively the inertia group and a Frobenius
element of some prime ideal $\mathfrak{p}$ of $\overline{\mathbf{Q}}$ lying over
$p$, and

$$V_\ell^*(A)^{I(\mathfrak{p})} = \{v \in V_\ell^*(A) : \rho_\ell^*(g)v = v \text{ for all } g \in I(\mathfrak{p})\}.$$

That $P_p(A,t)$ is independent of the choice of $\mathfrak{p}$ and
$\sigma_{\mathfrak{p}}$ follows by a straightforward verification from the
conjugacy under $\mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q})$ of the prime ideals
lying over a given rational prime. Far deeper is the fact that $P_p(A,t)$
belongs to $\mathbf{Z}[t]$ and is independent of the choice of $\ell \ne p$.
Indeed we are able to make this assertion for all $p$, and not just for the $p$
where $A$ has good reduction, precisely because we have confined ourselves to
the case of abelian varieties, for which Grothendieck's semistable reduction
theorem [9] is available: in the case of an arbitrary smooth projective
variety, the analogues of $P_p(A,t)$ - defined using $\ell$-adic cohomology
groups $H^i(\cdot)$ rather than the Tate module - are not yet known to be
independent of $\ell$ when $p$ is a prime of bad reduction and $i > 1$ (for
$i = 1$ the $\ell$-adic cohomology group is dual to the Tate module of the
Albanese, so we are back to the case of abelian varieties). Now write

$$P_p(A,t) = \prod_{i=1}^{2g} (1 - \alpha_{i,p}t)$$

with complex numbers $\alpha_{i,p}$. One has

$$|\alpha_{i,p}| \le \sqrt{p} \qquad (1 \le i \le 2g)$$

with equality if $p$ is a prime of good reduction, whence the Euler product

$$L(A,s) = \prod_{p} P_p(A, p^{-s})^{-1}$$

converges in the region $\mathrm{Re}(s) > 3/2$. Another consequence of the
semistable reduction theorem is that one can associate to $A$ a well-defined
conductor $N(A)$ and sign $W(A) = \pm 1$ (cf. [17]). The definition of the
"root number" $W(A)$ requires the theory of local epsilon factors [6].

Conjecture 1. Put

$$\Lambda(A,s) = N(A)^{s/2}((2\pi)^{-s}\Gamma(s))^{g} L(A,s).$$

Then $\Lambda(A,s)$ has an analytic continuation to an entire function of order
one satisfying the functional equation

$$\Lambda(A,s) = W(A)\Lambda(A, 2 - s).$$
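
The bound on the reciprocal roots of $P_p(A,t)$ used in defining $L(A,s)$ can be illustrated in the simplest case $g = 1$. The following is a minimal sketch, not part of the text's argument: it relies on the standard fact (not stated above) that for an elliptic curve with good reduction at $p$ one has $P_p(A,t) = 1 - a_p t + p t^2$ with $a_p = p + 1 - \#\tilde{A}(\mathbf{F}_p)$, so that both reciprocal roots have absolute value $\sqrt{p}$ exactly when $a_p^2 \le 4p$. The curve $y^2 = x^3 - x$ and the function names are chosen here purely for illustration.

```python
def count_points(a, b, p):
    """Count points on y^2 = x^3 + a*x + b over F_p (p an odd prime),
    including the single point at infinity."""
    square_counts = {}
    for y in range(p):
        s = (y * y) % p
        square_counts[s] = square_counts.get(s, 0) + 1
    total = 1  # the point at infinity
    for x in range(p):
        rhs = (x ** 3 + a * x + b) % p
        total += square_counts.get(rhs, 0)
    return total


def frobenius_trace(a, b, p):
    """Return a_p = p + 1 - #E(F_p) for the reduced curve."""
    return p + 1 - count_points(a, b, p)


# y^2 = x^3 - x has discriminant 64, hence good reduction at every odd prime.
for p in (3, 5, 7, 11, 13, 17, 19, 23, 29, 31):
    ap = frobenius_trace(-1, 0, p)
    print(p, ap, ap * ap <= 4 * p)
```

For this particular curve the trace vanishes at primes congruent to 3 modulo 4, which makes the equality $|\alpha_{i,p}| = \sqrt{p}$ visible by inspection: the factor is then $1 + p t^2$.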


It is also useful to have at hand a slightly less precise formulation of the
conjecture, evocative of the state of affairs which prevails when $H^1(A)$ is
replaced by the cohomology of an arbitrary smooth projective variety:


Conjecture 1*. There exist:

- a finite set $S$ of prime numbers containing all primes where $A$ has
  bad reduction,
- for each $p \in S$, a polynomial
  $$P_p^*(A,t) = \prod_{i=1}^{2g} (1 - \alpha_{i,p}^* t) \in \mathbf{Z}[t]$$
  with $|\alpha_{i,p}^*| < p$ for all $i$,
- a positive integer $N^*(A)$, and
- a sign $W^*(A) \in \{\pm 1\}$,

such that if

$$L^*(A,s) = \prod_{p \in S} P_p^*(A, p^{-s})^{-1} \cdot \prod_{p \notin S} P_p(A, p^{-s})^{-1}$$

and

$$\Lambda^*(A,s) = N^*(A)^{s/2}((2\pi)^{-s}\Gamma(s))^{g} L^*(A,s)$$

then $\Lambda^*(A,s)$ has an analytic continuation to an entire function of
order one satisfying the functional equation

$$\Lambda^*(A,s) = W^*(A)\Lambda^*(A, 2 - s).$$


We have included a bound on $\alpha_{i,p}^*$ in the statement of Conjecture 1*
to ensure that if Conjecture 1 is true then $N^*(A)$, $W^*(A)$, and
$P_p^*(A,t)$ coincide respectively with $N(A)$, $W(A)$, and $P_p(A,t)$. Indeed
for all good $p$ (and hence in particular for all $p \notin S$) we already have
the stronger information that $|\alpha_{i,p}| = \sqrt{p}$, so that the stated
bound on $\alpha_{i,p}^*$ affords a uniform estimate

$$p > \begin{cases} |\alpha_{i,p}^*| & (p \in S) \\ |\alpha_{i,p}| & (p \notin S); \end{cases}$$

but a remark of Deligne-Serre ([7], p. 515, Lemme 4.9) then shows that
$N^*(A)$, $W^*(A)$, and the $P_p^*(A,t)$ are uniquely determined by the
functional equation, whence these quantities coincide with the corresponding
quantities in Conjecture 1 whenever the latter conjecture is satisfied.

3.2. Modular forms. Quite apart from its significance for the arithmetic
of abelian varieties, Conjecture 1 asserts the existence of a class of Dirichlet
series with Euler products and functional equations. Such Dirichlet series
arise naturally in the theory of modular forms.


Let $k$ be a positive integer. Given a holomorphic function $f$ on
$\mathfrak{H}$ and a matrix

$$\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{GL}^+(2, \mathbf{R}),$$

we put

$$(f|_k\gamma)(z) = \frac{\det(\gamma)^{k/2}}{(cz + d)^k}\, f(\gamma z).$$

This formula defines a right action of $\mathrm{GL}^+(2, \mathbf{R})$ on the space
of holomorphic functions on $\mathfrak{H}$. Now let $\Gamma$ be a subgroup of
finite index in $\mathrm{SL}(2, \mathbf{Z})$. A modular form of weight $k$ for
$\Gamma$ is a holomorphic function $f$ on $\mathfrak{H}$ satisfying two
conditions:

(i) $f|_k\gamma = f$ for $\gamma \in \Gamma$.
(ii) For every $\delta \in \mathrm{SL}(2, \mathbf{Z})$ the function $f|_k\delta$
has a Fourier expansion of the form

$$(f|_k\delta)(z) = \sum_{n \ge 0} a(n) e^{2\pi i n z/M}.$$

If for every $\delta \in \mathrm{SL}(2, \mathbf{Z})$ the coefficient $a(0)$ in (ii)
is 0 then $f$ is called a cusp form. The vector space of modular forms of
weight $k$ for $\Gamma$ will be denoted $M_k(\Gamma)$ and the subspace of cusp
forms $S_k(\Gamma)$. These spaces are finite-dimensional. We remark in passing
that in condition (ii) the phrase "$\delta \in \mathrm{SL}(2, \mathbf{Z})$" can be
replaced by "$\delta \in \mathrm{GL}^+(2, \mathbf{Q})$", where
$\mathrm{GL}^+(2, \mathbf{Q}) = \mathrm{GL}(2, \mathbf{Q}) \cap \mathrm{GL}^+(2, \mathbf{R})$.
This is simply a matter of writing an element of $\mathrm{GL}^+(2, \mathbf{Q})$ as
the product of an element of $\mathrm{SL}(2, \mathbf{Z})$ and an upper triangular
matrix. It follows in particular that if $\Gamma$ is normalized by a matrix
$\delta \in \mathrm{GL}^+(2, \mathbf{Q})$ then the spaces $M_k(\Gamma)$ and
$S_k(\Gamma)$ are stable under the map $f \mapsto f|_k\delta$.

Let us now specialize to the case $\Gamma = \Gamma_1(N)$. In this case we denote
the spaces $M_k(\Gamma)$ and $S_k(\Gamma)$ simply by $M_k(N)$ and $S_k(N)$.
Furthermore, given a character $\chi$ of $(\mathbf{Z}/N\mathbf{Z})^\times$ we let
$M_k(N, \chi)$ and $S_k(N, \chi)$ be the subspaces of $M_k(N)$ and $S_k(N)$
consisting of $f$ such that

$$f|_k\gamma = \chi(d) f$$

for

$$\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma_0(N).$$

(Implicit in the notation $\chi(d)$ is the usual identification of characters of
$(\mathbf{Z}/N\mathbf{Z})^\times$ with Dirichlet characters modulo $N$.) Another
way to describe the subspaces $M_k(N, \chi)$ and $S_k(N, \chi)$ is to say that
they are the $\chi$-eigenspaces for the "diamond operators"
$f \mapsto f|_k\langle d \rangle$. In this approach $d$ denotes an element of
$(\mathbf{Z}/N\mathbf{Z})^\times$, and the operator $\langle d \rangle$ is defined
by setting

$$f|_k\langle d \rangle = f|_k\gamma$$

for any $\gamma \in \Gamma_0(N)$ which reduces modulo $N$ to a matrix with $d$ as
lower right-hand entry. In view of the isomorphism

$$\Gamma_0(N)/\Gamma_1(N) \to (\mathbf{Z}/N\mathbf{Z})^\times, \qquad \text{coset of } \begin{pmatrix} a & b \\ c & d \end{pmatrix} \mapsto d \pmod{N},$$

the diamond operators give a well-defined action of
$(\mathbf{Z}/N\mathbf{Z})^\times$ on $M_k(N)$ and $S_k(N)$, and consequently we have
eigenspace decompositions

$$M_k(N) = \bigoplus_{\chi} M_k(N, \chi)$$

and

$$S_k(N) = \bigoplus_{\chi} S_k(N, \chi),$$

where $\chi$ runs over Dirichlet characters modulo $N$. Note that if $\chi$ is
the trivial character then $M_k(N, \chi)$ and $S_k(N, \chi)$ coincide with
$M_k(\Gamma_0(N))$ and $S_k(\Gamma_0(N))$ respectively.

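Henceforth we restrict our attention to cusp forms. To see why cusp
forms give rise to Dirichlet series with functional equations, observe that
the matrix

$$W_N = \begin{pmatrix} 0 & -1 \\ N & 0 \end{pmatrix}$$

normalizes $\Gamma_1(N)$, whence $f|_kW_N \in S_k(N)$ if $f \in S_k(N)$. In fact

$$\begin{pmatrix} 0 & -1 \\ N & 0 \end{pmatrix} \begin{pmatrix} a & b \\ cN & d \end{pmatrix} \begin{pmatrix} 0 & -1 \\ N & 0 \end{pmatrix}^{-1} = \begin{pmatrix} d & -c \\ -bN & a \end{pmatrix},$$

so that $f|_kW_N \in S_k(N, \overline{\chi})$ if $f \in S_k(N, \chi)$. Now write

$$f(z) = \sum_{n \ge 1} a(n) e^{2\pi i n z}$$

and

$$(f|_kW_N)(z) = \sum_{n \ge 1} b(n) e^{2\pi i n z}$$

and put

$$A(s) = N^{s/2}(2\pi)^{-s}\Gamma(s) \sum_{n \ge 1} a(n) n^{-s}$$

and

$$B(s) = N^{s/2}(2\pi)^{-s}\Gamma(s) \sum_{n \ge 1} b(n) n^{-s}.$$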
As Hecke observed, these Dirichlet series converge absolutely in some right
half-plane and can be analytically continued by a method which goes back
to Riemann's paper on the Riemann zeta function: The usual interchange
of summation and integration shows that

$$A(s) = \int_0^\infty f(it/\sqrt{N})\, t^{s}\, \frac{dt}{t},$$

whence

$$A(s) = \int_0^1 f(it/\sqrt{N})\, t^{s}\, \frac{dt}{t} + \int_1^\infty f(it/\sqrt{N})\, t^{s}\, \frac{dt}{t}
= \int_1^\infty \left( f\!\left(\frac{i}{t\sqrt{N}}\right) t^{-s} + f(it/\sqrt{N})\, t^{s} \right) \frac{dt}{t}$$

on making the change of variables $t \mapsto 1/t$ in the integral from 0 to 1.
Since

$$f\!\left(\frac{i}{t\sqrt{N}}\right) = (it)^k (f|_kW_N)(it/\sqrt{N})$$

we obtain

$$A(s) = \int_1^\infty \left( i^k (f|_kW_N)(it/\sqrt{N})\, t^{k-s} + f(it/\sqrt{N})\, t^{s} \right) \frac{dt}{t}.$$

But $(W_N)^2 = -NI$, and consequently $f|_k(W_N)^2 = (-1)^k f$. Hence one can
repeat the preceding calculation with $A(s)$ replaced by $B(s)$, $f$ by
$f|_kW_N$, and $f|_kW_N$ by $(-1)^k f$, and a comparison of the resulting
expressions for $A(s)$ and $B(s)$ yields:


Proposition 14. The functions $A(s)$ and $B(s)$ have analytic continuations
to entire functions of order one satisfying the functional equation
$A(s) = i^k B(k - s)$.
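
The two matrix identities on which this derivation rests, namely the conjugation formula for $W_N$ and the relation $(W_N)^2 = -NI$, can be checked mechanically. The sketch below is illustrative only; the level $N = 11$, the sample matrix, and the helper names `matmul` and `inverse` are assumptions made for the example.

```python
from fractions import Fraction


def matmul(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inverse(x):
    """Inverse of a 2x2 matrix with nonzero determinant, using exact fractions."""
    det = Fraction(x[0][0] * x[1][1] - x[0][1] * x[1][0])
    return [[x[1][1] / det, -x[0][1] / det],
            [-x[1][0] / det, x[0][0] / det]]


N = 11
W = [[0, -1], [N, 0]]

# A sample element of Gamma_0(N), written as (a, b; c*N, d) with determinant 1.
a, b, c, d = 4, 1, 1, 3
gamma = [[a, b], [c * N, d]]
assert a * d - b * c * N == 1

conjugate = matmul(matmul(W, gamma), inverse(W))
print(conjugate == [[d, -c], [-b * N, a]])      # conjugation formula
print(matmul(W, W) == [[-N, 0], [0, -N]])       # (W_N)^2 = -N I
```

The conjugate has lower right-hand entry $a$, which is the inverse of $d$ modulo $N$; this is why the character is replaced by its complex conjugate.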


We have avoided referring to the Dirichlet series $\sum a(n) n^{-s}$ and
$\sum b(n) n^{-s}$ as $L$-functions, because as yet we have imposed no
condition to guarantee the existence of an Euler product. For this we need the
Hecke operators.


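3.3. Hecke operators. Given a prime number $p$, let $\Delta_p$ denote the set
of $2 \times 2$ matrices with integer coefficients and determinant $p$ which are
congruent modulo $N$ to a matrix of the form

$$\begin{pmatrix} 1 & * \\ 0 & p \end{pmatrix}.$$

It is immediate from the definition that $\Delta_p$ is stable under left and
right multiplication by $\Gamma_1(N)$, and elementary calculations show that if
$\Gamma = \Gamma_1(N)$ and

$$\delta_p = \begin{pmatrix} 1 & 0 \\ 0 & p \end{pmatrix}$$

then $\Delta_p$ has the one-term double-coset decomposition

$$\Delta_p = \Gamma\delta_p\Gamma$$

and the following decomposition as a disjoint union of right cosets:

$$\Delta_p = \begin{cases}
\bigcup_{\nu=0}^{p-1} \Gamma \begin{pmatrix} 1 & \nu \\ 0 & p \end{pmatrix} \cup \Gamma\langle p \rangle \begin{pmatrix} p & 0 \\ 0 & 1 \end{pmatrix} & \text{if } p \nmid N, \\
\bigcup_{\nu=0}^{p-1} \Gamma \begin{pmatrix} 1 & \nu \\ 0 & p \end{pmatrix} & \text{if } p \mid N
\end{cases}$$

(recall that if $p$ does not divide $N$ then $\langle p \rangle$ denotes an
arbitrary element of $\Gamma_0(N)$ with lower right-hand entry congruent to $p$
modulo $N$). Of course if $\gamma \in \Gamma_1(N)$ and $\{\delta\}$ is any set of
representatives for the distinct right cosets of $\Gamma_1(N)$ in $\Delta_p$
then $\{\delta\gamma\}$ is another such set, because $\Delta_p$ is stable under
right multiplication by $\Gamma_1(N)$.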
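
The coset decomposition can be made concrete in the simplest case $N = 1$, where $\Gamma = \mathrm{SL}(2, \mathbf{Z})$, $\langle p \rangle$ may be taken to be the identity, and $\Delta_p$ is the set of all integer matrices of determinant $p$. The sketch below is an illustration under that assumption only; the function name `reduce_to_representative` is hypothetical. It reduces a matrix by left multiplication by elements of $\mathrm{SL}(2, \mathbf{Z})$ until it reaches one of the $p + 1$ listed representatives.

```python
def reduce_to_representative(m, p):
    """Reduce an integer matrix m = ((a, b), (c, d)) of determinant p (p prime)
    by left multiplication by SL(2, Z) to (1, nu; 0, p) or (p, 0; 0, 1)."""
    (a, b), (c, d) = m
    assert a * d - b * c == p
    # Euclidean algorithm on the first column. Each step is left
    # multiplication by ((0, 1), (-1, q)), which has determinant 1.
    while c != 0:
        q = a // c
        a, b, c, d = c, d, q * c - a, q * d - b
    # Left multiplication by -I makes the upper left entry positive.
    if a < 0:
        a, b, d = -a, -b, -d
    # Now a * d = p with a, d > 0; left multiplication by ((1, k), (0, 1))
    # reduces the upper right entry modulo d.
    return ((a, b % d), (0, d))


p = 3
found = set()
for a in range(-3, 4):
    for b in range(-3, 4):
        for c in range(-3, 4):
            for d in range(-3, 4):
                if a * d - b * c == p:
                    found.add(reduce_to_representative(((a, b), (c, d)), p))
print(sorted(found))
```

The search only shows that every matrix in the sampled box lands on one of the $p + 1$ representatives; that distinct representatives lie in distinct cosets is the assertion of the text and is not verified by the code.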

The $p$-th Hecke operator

$$T_p : S_k(N) \to S_k(N)$$

is defined by the formula

$$f|_kT_p = p^{k/2 - 1} \sum_{\delta} f|_k\delta,$$

where $\delta$ runs over a set of representatives for the distinct right cosets
of $\Gamma_1(N)$ in $\Delta_p$. The definition is independent of the choice of
coset representatives because $f \in S_k(N)$. Furthermore, $f|_kT_p$ does belong
to $S_k(N)$, because right multiplication by any $\gamma \in \Gamma_1(N)$ sends
one set of right coset representatives to another. For much the same reason,
$T_p$ commutes with the diamond operators $\langle d \rangle$, whence each
subspace $S_k(N, \chi)$ is stable under $T_p$: since $\Gamma_0(N)$ normalizes
both $\Delta_p$ and $\Gamma_1(N)$, conjugation by an element of $\Gamma_0(N)$
sends one set of right coset representatives for $\Gamma_1(N)$ in $\Delta_p$ to
another. To exhibit the effect of $T_p$ on Fourier expansions, suppose that
$f \in S_k(N, \chi)$ and write

$$f(z) = \sum_{n \ge 1} a(n) q^n$$

with $q = e^{2\pi i z}$. A straightforward calculation using the right coset
representatives listed above gives

$$(f|_kT_p)(z) = \sum_{n \ge 1} a(pn) q^n + \chi(p) p^{k-1} \sum_{n \ge 1} a(n) q^{pn}.$$

Note that if $p$ divides $N$ then $\chi(p)$ is to be interpreted as 0 in keeping
with the usual conventions for Dirichlet characters modulo $N$. In the
literature $T_p$ is often denoted $U_p$ in this case and the preceding formula
is written

$$(f|_kU_p)(z) = \sum_{n \ge 1} a(pn) q^n \qquad (p \mid N).$$

The notation $U_p$ has the advantage of forestalling an ambiguity which in
principle could arise when $N = pM$, $p \nmid M$, and $f \in S_k(M)$: in this
situation the expression $f|_kT_p$ can have two possible meanings depending
on whether we regard $f$ as belonging to $S_k(M)$ or to $S_k(N)$. Nevertheless,
we shall continue to use the notation $T_p$ for all primes $p$, leaving the
appropriate interpretation to context.
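
The action of $T_p$ on Fourier coefficients can be tried out on a concrete form. The sketch below uses the discriminant function $\Delta(z) = q\prod_{m \ge 1}(1 - q^m)^{24}$, which is not discussed in the text; it is assumed here as the standard example of a cusp form of weight 12 and level 1 (trivial character), and it is a standard fact that it is an eigenvector of every $T_p$. The function names are hypothetical.

```python
def delta_coefficients(n_max):
    """Return a list tau with tau[n] the n-th Fourier coefficient of
    Delta = q * prod_{m >= 1} (1 - q^m)^24 for 1 <= n <= n_max (tau[0] = 0)."""
    series = [0] * n_max          # coefficients of the product, degrees 0..n_max-1
    series[0] = 1
    for m in range(1, n_max):
        for _ in range(24):
            # Multiply in place by (1 - q^m), highest degree first.
            for j in range(n_max - 1, m - 1, -1):
                series[j] -= series[j - m]
    return [0] + series           # multiplying by q shifts every degree by one


def hecke(a, p, k, chi_p=1):
    """Coefficients of f|T_p computed as a(pn) + chi(p) p^(k-1) a(n/p).

    Only indices n with p*n < len(a) can be computed from the input."""
    n_max = (len(a) - 1) // p
    out = [0] * (n_max + 1)
    for n in range(1, n_max + 1):
        out[n] = a[p * n]
        if n % p == 0:
            out[n] += chi_p * p ** (k - 1) * a[n // p]
    return out


tau = delta_coefficients(60)
for p in (2, 3, 5, 7):
    image = hecke(tau, p, 12)
    is_eigen = all(image[n] == tau[p] * tau[n] for n in range(1, len(image)))
    print(p, tau[p], is_eigen)
```

In each case the image is $\tau(p)$ times the original expansion, as far as the truncation allows the comparison.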

By a Hecke eigenform we shall mean a nonzero element of $S_k(N, \chi)$ which
is an eigenvector of the operators $T_p$ for all primes $p$. If
$f = \sum a(n) q^n$ is a Hecke eigenform and $\lambda_p$ is the eigenvalue of
$T_p$ on $f$, then the above formula for $f|_kT_p$ gives

$$a(pn) - \lambda_p a(n) + \chi(p) p^{k-1} a(n/p) = 0 \qquad (n \ge 1),$$

where $a(n/p)$ is understood to be 0 if $n$ is not divisible by $p$. Taking
$n = 1$ we see that $a(p) = \lambda_p a(1)$, so that $a(1) = 0$ implies
$a(p) = 0$. More generally, using induction on the total number of prime
factors of $n$ one finds that if $a(1) = 0$ then $a(n) = 0$ for all $n \ge 1$,
whence $f = 0$. Therefore:


Proposition 15. If $f = \sum_{n \ge 1} a(n) q^n$ is a Hecke eigenform then
$a(1) \ne 0$.


A Hecke eigenform $f = \sum a(n) q^n$ is said to be normalized if $a(1) = 1$.
The proposition implies that if $f$ is any Hecke eigenform then some scalar
multiple of $f$ is normalized. For a normalized eigenform the relation
$a(p) = \lambda_p a(1)$ becomes $\lambda_p = a(p)$, whence the recursion formula
for $a(n)$ becomes

$$a(pn) - a(p) a(n) + \chi(p) p^{k-1} a(n/p) = 0.$$

Taking $n = p^{\nu-1}$ with $\nu \ge 1$ one sees that

$$a(p^{\nu}) - a(p) a(p^{\nu-1}) + \chi(p) p^{k-1} a(p^{\nu-2}) = 0,$$

and then taking $n = p^{\nu-1} m$ with $m$ relatively prime to $p$ one deduces
by induction on $\nu$ that $a(p^{\nu} m) = a(p^{\nu}) a(m)$. A further induction
on the number of distinct prime factors of some $l$ relatively prime to $m$
shows that $a(lm) = a(l) a(m)$. In other words, the function $n \mapsto a(n)$ is
multiplicative; the associated formal Dirichlet series has an Euler product:

$$\sum_{n \ge 1} a(n) n^{-s} = \prod_{p} \Bigl( \sum_{\nu \ge 0} a(p^{\nu}) p^{-\nu s} \Bigr).$$

On the other hand, the recursion relation for $a(p^{\nu})$ amounts to the formal
identity

$$\sum_{\nu \ge 0} a(p^{\nu}) p^{-\nu s} = (1 - a(p) p^{-s} + \chi(p) p^{k-1} p^{-2s})^{-1},$$

and substitution in the preceding equation gives one direction of the
following equivalence (the other is obtained by reversing the argument):

Proposition 16. For an element $f(z) = \sum_{n \ge 1} a(n) e^{2\pi i n z}$ of
$S_k(N, \chi)$, the following are equivalent:

(i) $f$ is a normalized Hecke eigenform.
(ii) $\sum_{n \ge 1} a(n) n^{-s} = \prod_{p} (1 - a(p) p^{-s} + \chi(p) p^{k-1-2s})^{-1}$.
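
The multiplicativity and the prime-power recursion behind the Euler product can be checked numerically on an example. As an assumption for illustration only, the sketch below again takes the discriminant function $\Delta = q\prod_{m \ge 1}(1 - q^m)^{24}$, a normalized Hecke eigenform of weight $k = 12$ and level 1 with trivial character (a standard fact not established in the text); the function name is hypothetical.

```python
from math import gcd


def delta_coefficients(n_max):
    """Return a list tau with tau[n] the n-th Fourier coefficient of
    Delta = q * prod_{m >= 1} (1 - q^m)^24 for 1 <= n <= n_max (tau[0] = 0)."""
    series = [0] * n_max
    series[0] = 1
    for m in range(1, n_max):
        for _ in range(24):
            # Multiply in place by (1 - q^m), highest degree first.
            for j in range(n_max - 1, m - 1, -1):
                series[j] -= series[j - m]
    return [0] + series


LIMIT = 100
tau = delta_coefficients(LIMIT)
k = 12

# Multiplicativity: a(lm) = a(l) a(m) whenever l and m are relatively prime.
multiplicative = all(
    tau[l * m] == tau[l] * tau[m]
    for l in range(1, LIMIT + 1)
    for m in range(1, LIMIT // l + 1)
    if gcd(l, m) == 1
)

# Recursion: a(p^v) = a(p) a(p^(v-1)) - p^(k-1) a(p^(v-2)) for v >= 2.
recursion = True
for p in (2, 3, 5, 7):
    v = 2
    while p ** v <= LIMIT:
        expected = tau[p] * tau[p ** (v - 1)] - p ** (k - 1) * tau[p ** (v - 2)]
        recursion = recursion and (tau[p ** v] == expected)
        v += 1

print(multiplicative, recursion)
```

Together the two checks are the coefficient-level content of statement (ii) for this form, up to the chosen truncation.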


If $f$ is a normalized Hecke eigenform then the Dirichlet series in (ii) is
called the $L$-function of $f$ and denoted $L(f,s)$. From Proposition 14 we
know that there is a functional equation relating the $L$-function of $f$ to a
Dirichlet series associated to $f|_kW_N$, but we do not know that the latter
Dirichlet series has an Euler product. Thus it remains to find conditions
under which both $f$ and $f|_kW_N$ are Hecke eigenforms. Such conditions
are provided by the theory of new forms. The starting point is to define a
suitable inner product on $S_k(N)$.


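3.4. The Petersson inner product. Put

$$S_k = \bigcup_{\Gamma} S_k(\Gamma),$$

where the union is taken over all subgroups $\Gamma$ of finite index in
$\mathrm{SL}(2, \mathbf{Z})$. We define an inner product
$\langle *, * \rangle$ on $S_k$ as follows. Given $f, g \in S_k$, choose a
subgroup $\Gamma$ of finite index in $\mathrm{SL}(2, \mathbf{Z})$ such that $f$ and
$g$ both belong to $S_k(\Gamma)$, and put

$$\langle f, g \rangle = [\mathrm{SL}(2, \mathbf{Z}) : \Gamma]^{-1} \int_{\Gamma\backslash\mathrm{GL}^+(2, \mathbf{R})} (f|_kr)(i)\, \overline{(g|_kr)(i)}\, dr,$$

where $dr$ denotes the measure on $\Gamma\backslash\mathrm{GL}^+(2, \mathbf{R})$
afforded by a Haar measure on $\mathrm{GL}^+(2, \mathbf{R})$ (recall that
$\mathrm{GL}^+(2, \mathbf{R})$ is unimodular: a left Haar measure is a right Haar
measure). Using the fact that $f$ and $g$ are cusp forms, one can check that
the integral is absolutely convergent. Furthermore, the factor
$[\mathrm{SL}(2, \mathbf{Z}) : \Gamma]^{-1}$ in front of the integral guarantees
that the value of $\langle f, g \rangle$ is independent of the choice of
$\Gamma$. Now if

$$r = \begin{pmatrix} y & x \\ 0 & 1 \end{pmatrix}$$

with $y > 0$ then

$$(f|_kr)(i)\, \overline{(f|_kr)(i)} = |f(x + iy)|^2 y^k,$$

and consequently $\langle f, f \rangle > 0$ if $f \ne 0$. Thus
$\langle *, * \rangle$ is in fact an inner product. Since we have not specified
a choice of Haar measure on $\mathrm{GL}^+(2, \mathbf{R})$, we have defined
$\langle *, * \rangle$ only up to a scalar multiple; this suffices for our
purposes.

Next we observe that if $f, g \in S_k$ and $\delta \in \mathrm{GL}^+(2, \mathbf{Q})$
then $f|_k\delta$ and $g|_k\delta^{-1}$ both belong to $S_k$ and

$$\langle f|_k\delta, g \rangle = \langle f, g|_k\delta^{-1} \rangle.$$

Indeed choose $\Gamma$ of finite index in $\mathrm{SL}(2, \mathbf{Z})$ so that
$f, g \in S_k(\Gamma)$. Then the groups
$\Gamma' = \Gamma \cap \delta^{-1}\Gamma\delta$ and
$\Gamma'' = \Gamma \cap \delta\Gamma\delta^{-1}$ also have finite index in
$\mathrm{SL}(2, \mathbf{Z})$ and satisfy $\delta\Gamma'\delta^{-1} = \Gamma''$. Hence
we can express $\langle f|_k\delta, g \rangle$ and
$\langle f, g|_k\delta^{-1} \rangle$ as integrals over
$\Gamma'\backslash\mathrm{GL}^+(2, \mathbf{R})$ and
$\Gamma''\backslash\mathrm{GL}^+(2, \mathbf{R})$ respectively, and the stated
formula follows from the left-invariance of Haar measure on
$\mathrm{GL}^+(2, \mathbf{R})$. More generally, taking
$\gamma, \gamma' \in \Gamma$ and replacing $\delta$ by $\gamma\delta\gamma'$, we
find that

$$\langle f|_k\gamma\delta\gamma', g \rangle = \langle f, g|_k\delta^{-1} \rangle,$$

because $f|_k\gamma = f$ and $g|_k(\gamma')^{-1} = g$.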

Let us apply the preceding formula with $\Gamma = \Gamma_1(N)$ and $\delta \in \Delta_p$, where
$p$ is a prime not dividing $N$. Since $\Delta_p$ is equal to a single double coset of
$\Gamma$, we have $\Delta_p = \Gamma\delta\Gamma$ and consequently

\[
\langle f|_k\delta', g \rangle = \langle f, g|_k\delta^{-1} \rangle
\]

for any $\delta' \in \Delta_p$. It follows that

\[
\langle f|_kT_p, g \rangle = (p+1)p^{k/2-1} \langle f, g|_k\delta^{-1} \rangle, \tag{1}
\]

because $f|_kT_p$ is the sum of $p+1$ terms of the form $p^{k/2-1}f|_k\delta'$. Take
$\delta = \delta_p$ in (1), and as usual, let $\langle p \rangle$ denote any element of $\Gamma_0(N)$ with lower
right-hand entry congruent to $p$ modulo $N$ (and hence with upper left-hand
entry congruent to $p^{-1}$ modulo $N$). Since

\[
(pI)\langle p \rangle(\delta_p)^{-1} \in \Delta_p
\]

formula (1) becomes

\[
\langle f|_kT_p, g \rangle = (p+1)p^{k/2-1} \langle f, (g|_k\langle p \rangle^{-1})|_k\delta \rangle \tag{2}
\]

with some new element $\delta$ of $\Delta_p$. On the other hand, repeating a previous
argument we see that $\langle f, (g|_k\langle p \rangle^{-1})|_k\gamma\delta\gamma' \rangle$ is independent of $\gamma, \gamma' \in \Gamma$, and
consequently that

\[
(p+1)p^{k/2-1} \langle f, (g|_k\langle p \rangle^{-1})|_k\delta \rangle = \langle f, (g|_k\langle p \rangle^{-1})|_kT_p \rangle. \tag{3}
\]

Together, (2) and (3) give

\[
T_p^* = \langle p \rangle^{-1}T_p,
\]

where $T_p^*$ denotes the adjoint of $T_p$ on $S_k(N)$ with respect to $\langle *,* \rangle$. Since
the diamond operators commute with the Hecke operators, we conclude
that for $p$ not dividing $N$ the operators $T_p$ are normal. We also obtain:

Proposition 17. Let $f \in S_k(N,\chi)$ be a Hecke eigenform and $p$ a prime
not dividing $N$. If $\lambda_p$ is the eigenvalue of $T_p$ on $f$ then $\overline{\lambda}_p = \overline{\chi}(p)\lambda_p$.


As a commuting family of normal operators, the operators $T_p$ ($p \nmid N$)
are simultaneously diagonalizable on $S_k(N)$. However, simultaneous
diagonalization of the $T_p$ for all primes $p$, including those dividing $N$, is a more
delicate matter and is possible in general only on a subspace of $S_k(N)$, the
subspace of new forms.


3.5. New forms. Consider positive integers $M$ and $r$ such that $M$ divides
$N$ properly and $r$ divides $N/M$, and put

\[
V_r = \begin{pmatrix} r & 0 \\ 0 & 1 \end{pmatrix}.
\]

The calculation

\[
\begin{pmatrix} r & 0 \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} a & b \\ c & d \end{pmatrix}
\begin{pmatrix} r & 0 \\ 0 & 1 \end{pmatrix}^{-1}
= \begin{pmatrix} a & br \\ c/r & d \end{pmatrix}
\]

shows that the map $f \mapsto f|_kV_r$ sends $S_k(M)$ to $S_k(N)$, indeed each
subspace $S_k(M,\chi)$ to the corresponding subspace of $S_k(N)$. In fact a glance
at Fourier expansions shows that if $p$ does not divide $N$ then
$(f|_kT_p)|_kV_r = (f|_kV_r)|_kT_p$, so that $f \mapsto f|_kV_r$ sends eigenvectors of $T_p$ to
eigenvectors of $T_p$. The need for a distinction between "old forms" and "new forms"
arises because this last assertion fails for $p$ dividing $N$.
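
The matrix calculation above can be checked mechanically. The following Python sketch conjugates an element of $\Gamma_1(N)$ by $V_r$ with exact rational arithmetic and confirms that the result lies in $\Gamma_1(M)$. The sample values $N = 12$, $M = 2$, $r = 3$, the matrix `gamma`, and the helper names are our own illustrative choices and do not come from the text.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def conjugate_by_V(gamma, r):
    """Return V_r * gamma * V_r^{-1} with V_r = [[r, 0], [0, 1]], exactly."""
    V = [[Fraction(r), Fraction(0)], [Fraction(0), Fraction(1)]]
    V_inv = [[Fraction(1, r), Fraction(0)], [Fraction(0), Fraction(1)]]
    G = [[Fraction(x) for x in row] for row in gamma]
    return matmul(matmul(V, G), V_inv)

def in_gamma1(m, level):
    """Membership test for Gamma_1(level): integral, det 1, and
    congruent to [[1, *], [0, 1]] modulo level."""
    (a, b), (c, d) = m
    if any(Fraction(x).denominator != 1 for x in (a, b, c, d)):
        return False
    return (a * d - b * c == 1 and c % level == 0
            and a % level == 1 % level and d % level == 1 % level)

# Hypothetical sample data: M = 2 divides N = 12 properly and r = 3 divides N/M = 6.
N, M, r = 12, 2, 3
gamma = [[13, 5], [96, 37]]          # an element of Gamma_1(12)
conj = conjugate_by_V(gamma, r)      # entries (a, b*r; c/r, d) as in the display
print(conj, in_gamma1(gamma, N), in_gamma1(conj, M))
```

Since $N$ divides $c$ and $r$ divides $N/M$, the entry $c/r$ is an integer divisible by $M$, which is the point the sketch makes concrete.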

The space of old forms of level $N$ is by definition the subspace $S_k(N)^{\mathrm{old}}$
of $S_k(N)$ spanned by the images of the maps $f \mapsto f|_kV_r$ as $M$ and $r$ vary
over all integers satisfying the divisibility conditions stated above. In other
words,

\[
S_k(N)^{\mathrm{old}} = \mathrm{span} \bigcup_{\substack{M \mid N \\ M < N}} \; \bigcup_{r \mid N/M} \{ f|_kV_r : f \in S_k(M) \}.
\]

A Hecke eigenform belonging to $S_k(N)^{\mathrm{old}}$ is called an old form of level $N$.
The space of new forms of level $N$, denoted $S_k(N)^{\mathrm{new}}$, is the orthogonal
complement of $S_k(N)^{\mathrm{old}}$ in $S_k(N)$ relative to the Petersson inner product.
A Hecke eigenform belonging to $S_k(N)^{\mathrm{new}}$ is called a new form of level $N$,
and a normalized new form of level $N$ is called a primitive form of level $N$.
Let $\mathrm{Prim}_k(N)$ denote the set of primitive forms of weight $k$ and level $N$.
One of the main theorems of the theory of new forms is that $\mathrm{Prim}_k(N)$ is
a basis for $S_k(N)^{\mathrm{new}}$; as a corollary one deduces that the set

\[
\bigcup_{M \mid N} \; \bigcup_{r \mid N/M} \{ f|_kV_r : f \in \mathrm{Prim}_k(M) \}
\]

is a basis for all of $S_k(N)$. Results such as these are important to mention
here because they show that the theory of new forms is nonvacuous, but
for present purposes the result of primary interest is the following theorem,
which will lead us to a functional equation for the $L$-function of a primitive
form:


Theorem 3. Given $f \in \mathrm{Prim}_k(N)$, $g \in S_k(N)$, and a finite set $S$ of prime
numbers such that $g$ is an eigenvector of $T_p$ for $p \notin S$, suppose that the
eigenvalues of $T_p$ on $f$ and $g$ coincide for $p \notin S$. Then $g$ is a scalar multiple
of $f$.


For the proof, the reader is referred to the literature on new forms:
Atkin-Lehner [2], Casselman [4], Li [14], and Miyake [15]. The application
to $L$-functions starts from the observation that if $f \in S_k(N,\chi)$ and we set

\[
\overline{f}(z) = \overline{f(-\overline{z})}
\]

then $\overline{f} \in S_k(N,\overline{\chi})$. This follows from the identity $-\overline{\gamma z} = \tilde{\gamma}(-\overline{z})$, where
$\gamma \in \mathrm{GL}^+(2,\mathbf{R})$ and $\tilde{\gamma}$ is obtained from $\gamma$ by negating the diagonal entries.
One also verifies that the map $f \mapsto \overline{f}$ is unitary with respect to $\langle *,* \rangle$ and
preserves $S_k(N)^{\mathrm{old}}$, whence it preserves $S_k(N)^{\mathrm{new}}$ as well. Now at the level
of Fourier expansions the map $f \mapsto \overline{f}$ has the form

\[
\sum_{n \ge 1} a(n)e^{2\pi inz} \mapsto \sum_{n \ge 1} \overline{a(n)}e^{2\pi inz}.
\]

Hence on applying complex conjugation to the formal identity in part (ii)
of Proposition 16, we see that if $f$ is a normalized Hecke eigenform, then
so is $\overline{f}$. Since $S_k(N)^{\mathrm{new}}$ is stable under $f \mapsto \overline{f}$ we conclude that $\mathrm{Prim}_k(N)$
is stable under this map also.

Suppose now that $f \in S_k(N,\chi)$. We shall compare $\overline{f}$ and $f|_kW_N$. For a
prime $p$ not dividing $N$, let $\Delta'_p$ denote the set consisting of $2 \times 2$ matrices
with integer coefficients and determinant $p$ which are congruent modulo $N$
to a matrix of the form

\[
\begin{pmatrix} p & * \\ 0 & 1 \end{pmatrix}.
\]

A calculation shows that

\[
W_N\Delta_pW_N^{-1} = \Delta'_p = \Delta_p\langle p \rangle^{-1}.
\]

Since $W_N$ normalizes $\Gamma_1(N)$ we deduce that if $\{\delta\}$ is a set of representatives
for the distinct right cosets of $\Gamma_1(N)$ in $\Delta_p$ then both $\{W_N\delta W_N^{-1}\}$ and
$\{\delta\langle p \rangle^{-1}\}$ are sets of representatives for the distinct right cosets of $\Gamma_1(N)$
in $\Delta'_p$. It follows that

\[
f|_kW_N|_kT_p = \overline{\chi}(p)\, f|_kT_p|_kW_N.
\]

Thus if $f$ is an eigenvector of $T_p$ for all primes $p$ not dividing $N$ then so
is $f|_kW_N$, and if $\lambda_p$ is the eigenvalue of $T_p$ on $f$ then $\lambda'_p = \overline{\chi}(p)\lambda_p$ is
the eigenvalue of $T_p$ on $f|_kW_N$. Referring to Proposition 17, we see that
$\lambda'_p = \overline{\lambda}_p$, and then Theorem 3 implies that $f|_kW_N$ is a scalar multiple of
$\overline{f}$. We shall write the scalar in question as $i^{-k}W(f)$, so that

\[
f|_kW_N = i^{-k}W(f)\overline{f}.
\]


Then Proposition 14 gives:

Proposition 18. Given $f \in \mathrm{Prim}_k(N)$, put

\[
\Lambda(f,s) = N^{s/2}(2\pi)^{-s}\Gamma(s)L(f,s).
\]

Then $\Lambda(f,s)$ has an analytic continuation to an entire function of order
one satisfying the functional equation $\Lambda(f,s) = W(f)\Lambda(\overline{f},k-s)$.


We have reached the limits of what can be done to suggest a possible 
connection between modular forms and Conjecture 1 on the basis of formal 
analytic properties alone. The next step is to make a connection between 
modular forms and modular curves, or at least between cusp forms of weight 
2 and regular differentials on modular curves. 


3.6. Differentials and cusp forms of weight 2. To begin with let $\Gamma$
be any subgroup of finite index in $\mathrm{SL}(2,\mathbf{Z})$ and let $\pi$ denote the restriction
to $\mathfrak{H}$ of the natural map $\mathfrak{H}^* \to \Gamma\backslash\mathfrak{H}^*$. If $\omega$ is a regular differential on $\Gamma\backslash\mathfrak{H}^*$
then $\pi^*\omega = f(z)dz$ for some function $f$ on $\mathfrak{H}$. We claim that the functions
$f$ which arise in this way are characterized by the following conditions:

(o) $f$ is holomorphic.

(i) $f(\gamma z)d(\gamma z) = f(z)dz$ for $\gamma \in \Gamma$.

(ii) Suppose that $\delta \in \mathrm{SL}(2,\mathbf{Z})$, and let $M$ be a positive integer such
that

\[
(f \circ \delta)(z+M)\,d(\delta(z+M)) = (f \circ \delta)(z)\,d(\delta z)
\]

(such an integer exists by (i)). Let $F$ be the holomorphic function
on the punctured unit disk $D^\circ = \{q \in \mathbf{C} : 0 < |q| < 1\}$ defined by

\[
f(\delta z)\frac{d(\delta z)}{dz} = F(e^{2\pi iz/M}) \qquad (z \in \mathfrak{H}).
\]

Then $F$ extends to a holomorphic function on the full unit disk
$D = \{q \in \mathbf{C} : |q| < 1\}$ vanishing at 0.


Indeed (i) says that the differential $f(z)dz$ on $\mathfrak{H}$ descends to a differential
on $\Gamma\backslash\mathfrak{H}$, while (o) is the condition for the descended differential to
be holomorphic (at an elliptic fixed point of $\Gamma$ the equivalence between the
holomorphy of $\omega$ and the holomorphy of $f$ requires a small verification). As
for (ii), its content is that the descended differential extends holomorphically
from $\Gamma\backslash\mathfrak{H}$ to $\Gamma\backslash\mathfrak{H}^*$. Again there is a small verification: if we assume
without loss of generality that $M$ is minimal, then the change of variables
$w = \delta z$, $q = e^{2\pi i\delta^{-1}w/M}$ defines a local parameter at $\delta\infty$, and condition (ii)
is a consequence of the fact that

\[
dw = \frac{dq}{q} \cdot \frac{dw}{dz} \cdot \frac{M}{2\pi i}.
\]

Thus (o), (i), and (ii) do characterize the functions on $\mathfrak{H}$ obtained by pulling back regular
differentials from $\Gamma\backslash\mathfrak{H}^*$. Now if $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is an element of $\mathrm{GL}^+(2,\mathbf{R})$
then

\[
d(\gamma z) = \frac{\det\gamma}{(cz+d)^2}\,dz.
\]

Therefore condition (i) can be rewritten

\[
f|_2\gamma = f \qquad (\gamma \in \Gamma),
\]

while in (ii) the requirement is the existence, for any $\delta \in \mathrm{SL}(2,\mathbf{Z})$, of a
Fourier series expansion of the form

\[
(f|_2\delta)(z) = \sum_{n \ge 1} a(n)e^{2\pi inz/M}.
\]

Returning to the equation $\pi^*\omega = f(z)dz$, we conclude that as $\omega$ runs over
the space of regular differentials on $\Gamma\backslash\mathfrak{H}^*$ the function $f$ runs over the space
of cusp forms of weight 2 for $\Gamma$.
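
The identity $f(\gamma z)d(\gamma z) = (f|_2\gamma)(z)dz$ behind this rewriting can be tested numerically. The sketch below approximates $d(\gamma z)/dz$ by a central difference and compares $f(\gamma z)\,d(\gamma z)/dz$ with $(f|_2\gamma)(z) = \det\gamma\,(cz+d)^{-2}f(\gamma z)$. The test function `f`, the matrix `g`, and the point `z0` are arbitrary choices of ours: `f` is merely holomorphic on the upper half-plane and is not a cusp form.

```python
def mobius(g, z):
    """Action of g = ((a, b), (c, d)) on z by fractional linear transformation."""
    (a, b), (c, d) = g
    return (a * z + b) / (c * z + d)

def slash2(f, g):
    """Weight-2 slash action: (f|_2 g)(z) = det(g) * (cz + d)^(-2) * f(gz)."""
    (a, b), (c, d) = g
    det = a * d - b * c
    return lambda z: det * (c * z + d) ** -2 * f(mobius(g, z))

def pullback_coefficient(f, g, z, h=1e-6):
    """Coefficient of dz in f(gz) d(gz), with d(gz)/dz from a central difference."""
    derivative = (mobius(g, z + h) - mobius(g, z - h)) / (2 * h)
    return f(mobius(g, z)) * derivative

# Hypothetical sample data (not from the text).
f = lambda z: 1 / (z + 2j) ** 3      # holomorphic on the upper half-plane
g = ((2.0, 1.0), (1.0, 3.0))         # determinant 5 > 0
z0 = 0.3 + 1.2j

lhs = pullback_coefficient(f, g, z0)
rhs = slash2(f, g)(z0)
print(abs(lhs - rhs))                # small, up to finite-difference error
```

The agreement depends only on the formula $d(\gamma z) = \det\gamma\,(cz+d)^{-2}dz$, so any holomorphic test function would serve equally well.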

Let us now specialize to the case $\Gamma = \Gamma_1(N)$. We shall write $H^0(\Omega^1_{X_1(N)})$
for the space of regular differentials on $X_1(N)$ defined over $\mathbf{Q}$, and similarly
$H^0(\Omega^1_{X_1(N)/\mathbf{C}})$ for the corresponding space over $\mathbf{C}$, so that

\[
H^0(\Omega^1_{X_1(N)/\mathbf{C}}) = \mathbf{C} \otimes_{\mathbf{Q}} H^0(\Omega^1_{X_1(N)}).
\]

The isomorphism just described gives an identification

\[
H^0(\Omega^1_{X_1(N)/\mathbf{C}}) \cong S_2(\Gamma_1(N)),
\]

and on the right-hand side we have an action of the Hecke operators $T_p$.
As we shall now explain, the Hecke correspondences determine operators
on the left-hand side (to be denoted $T_p$ also) such that the above isomorphism
respects the action of $T_p$. Quite generally, if $T = (Z,\varphi,\psi)$ is a
correspondence on a smooth projective curve $X$, then $T$ gives rise to the
operator

\[
H^0(\Omega^1_X) \to H^0(\Omega^1_X), \qquad \omega \mapsto \mathrm{tr}_\varphi(\psi^*\omega),
\]

where $\mathrm{tr}_\varphi$ is the trace on differentials associated to the morphism $\varphi$: if
$K \subset L$ is the inclusion of function fields afforded by $\varphi$ then any $\alpha \in
H^0(\Omega^1_Z)$ has the form $\alpha = u\,dv$ with $u \in L$ and $v \in K$, and by definition,
$\mathrm{tr}_\varphi(\alpha) = \mathrm{tr}_{L/K}(u)\,dv$. Returning to the case at hand, we see that the Hecke
correspondence $T_p = (X_1(N,p),\varphi_p,\psi_p)$ on $X_1(N)$ determines an operator
$T_p$ on $H^0(\Omega^1_{X_1(N)})$ and hence by extension of scalars an operator on
$H^0(\Omega^1_{X_1(N)/\mathbf{C}})$.


Proposition 19. The canonical isomorphism

\[
H^0(\Omega^1_{X_1(N)/\mathbf{C}}) \cong S_2(N)
\]

commutes with the action of $T_p$.


Proof. We begin with a general remark. Suppose that $X$ is a smooth
projective curve over a subfield of $\mathbf{C}$ and $T = (Z,\varphi,\psi)$ is a correspondence
on $X$. Let $d$ be the degree of $\varphi$. Since the base field is a subfield of $\mathbf{C}$, we can
discuss the correspondence $T$ in the language of Riemann surfaces, and in
particular we can speak of the local analytic sections $(\varphi^{-1})_{x_0,i}$ ($1 \le i \le d$)
of $\varphi$ in a neighborhood of some unramified point $x_0 \in X(\mathbf{C})$. Then

\[
\mathrm{tr}_\varphi(\alpha) = \sum_{1 \le i \le d} \bigl((\varphi^{-1})_{x_0,i}\bigr)^*\alpha
\]

locally at $x_0$. Thus for $\omega \in H^0(\Omega^1_{X/\mathbf{C}})$ we have

\[
T(\omega) = \sum_{1 \le i \le d} \bigl(\psi \circ (\varphi^{-1})_{x_0,i}\bigr)^*\omega.
\]

This formula makes it possible to compute the action of $T$ on differentials
directly from a knowledge of the map $T : X(\mathbf{C}) \to \mathrm{Div}(X(\mathbf{C}))$. Indeed if
the latter map is given locally by

\[
x \mapsto \sum_{1 \le i \le d} (\rho_{x_0,i}(x))
\]

with analytic functions $\rho_{x_0,i}$, then after a permutation of indices we have
$\rho_{x_0,i} = \psi \circ (\varphi^{-1})_{x_0,i}$ and consequently

\[
T(\omega) = \sum_{1 \le i \le d} \rho_{x_0,i}^*\,\omega
\]

locally at $x_0$.

In the case at hand we can identify $X_1(N)(\mathbf{C})$ with $\Gamma_1(N)\backslash\mathfrak{H}^*$, and in
the coordinate $z$ of $\mathfrak{H}$ the map $T_p : X_1(N)(\mathbf{C}) \to \mathrm{Div}(X_1(N)(\mathbf{C}))$ is given
by

\[
[z] \mapsto
\begin{cases}
\sum_{\nu=0}^{p-1} [(z+\nu)/p] + [\langle p \rangle pz] & \text{if } p \nmid N \\
\sum_{\nu=0}^{p-1} [(z+\nu)/p] & \text{if } p \mid N
\end{cases}
\]

(Proposition 9). This map can be written simply as

\[
[z] \mapsto \sum_\delta [\delta z],
\]

where $\delta$ runs over a set of representatives for the right cosets of $\Gamma_1(N)$
in $\Delta_p$. Now we have already observed that if $f$ is a function on $\mathfrak{H}$ and
$\gamma \in \mathrm{GL}^+(2,\mathbf{R})$ then $f(\gamma z)d(\gamma z) = (f|_2\gamma)(z)dz$. Thus for $f \in S_2(N)$ we can
write

\[
\sum_\delta f(\delta z)\,d(\delta z) = \sum_\delta (f|_2\delta)(z)\,dz = (f|_2T_p)(z)\,dz,
\]

the factor $p^{k/2-1}$ in the definition of $T_p$ being 1 for $k = 2$. This proves
Proposition 19.


Using Proposition 11 one proves the analogous statement for the diamond
operators:


Proposition 20. The canonical isomorphism

\[
H^0(\Omega^1_{X_1(N)/\mathbf{C}}) \cong S_2(N)
\]

commutes with the action of $\langle d \rangle$ for $d \in (\mathbf{Z}/N\mathbf{Z})^\times$.


One consequence of Propositions 19 and 20 is that $S_2(N)$ has a $\mathbf{Q}$-form
stable under the operators $T_p$ and $\langle d \rangle$, because $H^0(\Omega^1_{X_1(N)/\mathbf{C}})$ has
such a $\mathbf{Q}$-form, namely $H^0(\Omega^1_{X_1(N)})$. It follows that if $S_2(N,\chi)$ contains a
Hecke eigenform on which the operator $T_p$ has eigenvalue $\lambda_p$ then for any
$\sigma \in \mathrm{Aut}(\mathbf{C})$ the space $S_2(N,\chi^\sigma)$ contains a Hecke eigenform on which the
operator $T_p$ has eigenvalue $\lambda_p^\sigma$. Now the Fourier coefficients of a normalized
Hecke eigenform are polynomials with integer coefficients in the Hecke
eigenvalues and the character values. Hence we can define an action of
$\mathrm{Aut}(\mathbf{C})$ on the set of normalized Hecke eigenforms in $S_2(N)$ by the rule

\[
f = \sum_{n \ge 1} a(n)q^n \mapsto f^\sigma = \sum_{n \ge 1} a(n)^\sigma q^n.
\]

We claim that if $f$ is a new form of level $N$ then so is $f^\sigma$. Suppose on
the contrary that $f^\sigma$ is an old form. Then there is a proper divisor $M$ of
$N$ and an element $g = \sum_{n \ge 1} b(n)q^n$ of $\mathrm{Prim}_2(M)$ such that $a(p)^\sigma = b(p)$
for $p \nmid N$. Then $a(p) = b(p)^{\sigma^{-1}}$ for $p \nmid N$, whence $g^{\sigma^{-1}}$ is a normalized
Hecke eigenform in $S_2(M)$ which has the same Hecke eigenvalues as $f$ for
$p \nmid N$. This contradicts Theorem 3, proving the claim. We conclude that
the map $f \mapsto f^\sigma$ defines an action of $\mathrm{Aut}(\mathbf{C})$ on $\mathrm{Prim}_2(N)$. Since $\mathrm{Prim}_2(N)$
is finite, it follows that if $f \in \mathrm{Prim}_2(N)$ then the field generated by the
Fourier coefficients of $f$ has finite degree over $\mathbf{Q}$. We denote this field $E_f$.
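
The remark that the Fourier coefficients of a normalized Hecke eigenform are integer polynomials in the Hecke eigenvalues and character values can be illustrated in the simplest case. The sketch below assumes, as a standard fact not established in the text, that $q\prod_{n \ge 1}(1-q^n)^2(1-q^{11n})^2$ is the normalized eigenform spanning $S_2(\Gamma_0(11))$. It rebuilds every coefficient $a(n)$ from the prime values $a(p)$ alone, using multiplicativity together with the recursion $a(p^{e+1}) = a(p)a(p^e) - \chi(p)p\,a(p^{e-1})$ for trivial $\chi$ (so $\chi(p) = 0$ when $p$ divides the level). All function names are ours.

```python
def eta_product_level_11(bound):
    """Coefficients a(0..bound) of q * prod_{n>=1} (1-q^n)^2 (1-q^(11n))^2."""
    series = [0] * (bound + 1)
    series[1] = 1                               # the leading factor q
    for step in (1, 11):
        n = step
        while n <= bound:
            for _ in range(2):                  # each factor occurs squared
                for k in range(bound, n - 1, -1):
                    series[k] -= series[k - n]  # multiply by (1 - q^n) in place
            n += step
    return series

def factorize(n):
    """Prime factorization of n as a dict {prime: exponent} (trial division)."""
    factors, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def coefficient_from_eigenvalues(n, a_prime, level):
    """Rebuild a(n) from the values a(p); weight 2, trivial character."""
    value = 1
    for p, e in factorize(n).items():
        chi_p = 0 if level % p == 0 else 1
        prev, cur = 1, a_prime[p]               # a(p^0) and a(p^1)
        for _ in range(e - 1):
            prev, cur = cur, a_prime[p] * cur - chi_p * p * prev
        value *= cur
    return value

BOUND = 130
a = eta_product_level_11(BOUND)
a_prime = {p: a[p] for p in range(2, BOUND + 1) if factorize(p) == {p: 1}}
rebuilt = [coefficient_from_eigenvalues(n, a_prime, 11) for n in range(1, BOUND + 1)]
print(a[1:8], rebuilt == a[1:])
```

Here all coefficients are rational integers, in line with $E_f = \mathbf{Q}$ for this form.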
3.7. The Hecke algebra. Given a smooth projective curve $X$, we will let
$\mathrm{Corr}(X)$ denote the free abelian group on the set of isomorphism classes of
correspondences on $X$. If $T = (Z,\varphi,\psi)$ and $T' = (Z',\varphi',\psi')$ are correspondences
on $X$ we define the product of their isomorphism classes $[T'] \cdot [T]$
by the formula

\[
[T'] \cdot [T] = \sum_W [\widetilde{W},\ \varphi \circ \mathrm{pr}_Z \circ \nu_W,\ \psi' \circ \mathrm{pr}_{Z'} \circ \nu_W],
\]

where $W$ runs over the irreducible components of

\[
Z'' = \{(z,z') \in Z \times Z' : \psi(z) = \varphi'(z')\},
\]

$\nu_W : \widetilde{W} \to W$ is the normalization map, and $\mathrm{pr}_Z : Z'' \to Z$, $\mathrm{pr}_{Z'} : Z'' \to Z'$
are the projections. Extending this product to $\mathrm{Corr}(X)$ by $\mathbf{Z}$-linearity, we
make $\mathrm{Corr}(X)$ into a $\mathbf{Z}$-algebra. We shall view $\mathrm{Aut}(X)$ as a subgroup of
the multiplicative group of $\mathrm{Corr}(X)$ by identifying $\psi \in \mathrm{Aut}(X)$ with the
isomorphism class of the correspondence $(X,\mathrm{id}_X,\psi)$ on $X$.

In the case of $X_1(N)$ we are interested in the subalgebra of $\mathrm{Corr}(X_1(N))$
generated over $\mathbf{Z}$ by the isomorphism classes of all Hecke correspondences
$T_p$ and all diamond automorphisms $\langle d \rangle$. We denote this subalgebra by $\mathbf{T}$,
and refer to it as the Hecke algebra (of level $N$). Furthermore, we use the
same symbol $\mathbf{T}$ and the same term "Hecke algebra" for the image of $\mathbf{T}$ under
the canonical embedding of $\mathrm{Corr}(X_1(N))$ in $\mathrm{End}(J_1(N))$, and we likewise
identify the opposite algebra $\mathbf{T}^{\mathrm{opp}}$ with its image in $\mathrm{End}(H^0(\Omega^1_{X_1(N)}))$.
Alternatively, we can view $\mathbf{T}$ itself as acting on the dual space of $H^0(\Omega^1_{X_1(N)})$
or we can consider $\mathbf{T}$ to be acting on $H^0(\Omega^1_{X_1(N)})$ on the right. This last
point of view is consistent with our identification of $H^0(\Omega^1_{X_1(N)/\mathbf{C}})$ with
$S_2(N)$ (Propositions 19 and 20), and we may therefore think of $\mathbf{T}$ as
the subring of $\mathrm{End}(S_2(N))$ generated over $\mathbf{Z}$ by the Hecke operators and
diamond operators on $S_2(N)$. It follows in particular that $\mathbf{T}$ is commutative,
so that $\mathbf{T}$ and $\mathbf{T}^{\mathrm{opp}}$ are canonically isomorphic and every left $\mathbf{T}$-module is
a right $\mathbf{T}$-module.

The next step is to associate a quotient ring $\mathbf{T}_f$ of $\mathbf{T}$ to each $f \in
\mathrm{Prim}_2(N)$. Consider the ring homomorphism $\lambda_f : \mathbf{T} \to \mathbf{C}$ such that
$f|_2T = \lambda_f(T)f$ for $T \in \mathbf{T}$, and let $I_f$ be the kernel of $\lambda_f$. We set

\[
\mathbf{T}_f = \mathbf{T}/I_f.
\]

Thus $\mathbf{T}_f$ is the quotient of $\mathbf{T}$ by the annihilator ideal of $f$. Write $f(z) =
\sum_{n \ge 1} a(n)q^n$, and recall that $E_f = \mathbf{Q}(\{a(n) : n \ge 1\})$. If $S_2(N,\chi)$ is the
character space to which $f$ belongs then $\lambda_f$ induces an isomorphism

\[
\mathbf{Q} \otimes_{\mathbf{Z}} \mathbf{T}_f \to E_f
\]

sending $T_p + I_f$ to $a(p)$ and $\langle d \rangle + I_f$ to $\chi(d)$. Let $A_f$ be the abelian variety
over $\mathbf{Q}$ defined by

\[
A_f = J_1(N)/I_fJ_1(N).
\]

The action of $\mathbf{T}$ on $J_1(N)$ induces an action of $\mathbf{T}_f$ on $A_f$ and hence on each
$V_\ell(A_f)$.

Proposition 21. The image of $\mathbf{T}_f$ in $E_f$ is an order of $E_f$, and $V_\ell(A_f)$
is a free module of rank two over $\mathbf{Q}_\ell \otimes_{\mathbf{Z}} \mathbf{T}_f$. In particular, $A_f$ is an abelian
variety of dimension $[E_f : \mathbf{Q}]$, and $A_f$ is an elliptic curve if and only if the
Fourier coefficients of $f$ are rational.


Proof. The second statement is contained in the first because $V_\ell(A_f)$ is a
vector space of dimension $2\dim(A_f)$ over $\mathbf{Q}_\ell$, while

\[
\dim_{\mathbf{Q}_\ell} \mathbf{Q}_\ell \otimes_{\mathbf{Z}} \mathbf{T}_f = \mathrm{rank}_{\mathbf{Z}} \mathbf{T}_f = [E_f : \mathbf{Q}].
\]

To prove the first statement we start with the observation that as a subring
of $\mathrm{End}(J_1(N))$, the Hecke algebra $\mathbf{T}$ acts on $H_1(J_1(N)(\mathbf{C}),\mathbf{Z})$ and
consequently also on $H_1(X_1(N)(\mathbf{C}),\mathbf{Z})$, the two homology groups being isomorphic
via the map on homology induced by the embedding of $X_1(N)(\mathbf{C})$ in
$J_1(N)(\mathbf{C})$. Denoting the complex dual of $H^0(\Omega^1_{X_1(N)/\mathbf{C}})$ by $H^0(\Omega^1_{X_1(N)/\mathbf{C}})^*$,
we see that the standard isomorphism of complex tori

\[
J_1(N)(\mathbf{C}) \cong \frac{H^0(\Omega^1_{X_1(N)/\mathbf{C}})^*}{H_1(X_1(N)(\mathbf{C}),\mathbf{Z})}
\]

is actually an isomorphism of $\mathbf{T}$-modules. Hence so is the isomorphism

\[
J_1(N)(\mathbf{C}) \cong S_2(N)^*/\Lambda, \tag{1}
\]

where $\Lambda$ is the image of $H_1(X_1(N)(\mathbf{C}),\mathbf{Z})$ when we identify $H^0(\Omega^1_{X_1(N)/\mathbf{C}})^*$
with $S_2(N)^*$. The fact that the lattice $\Lambda$ in $S_2(N)^*$ is stable under $\mathbf{T}$
already shows that the eigenvalues of $\mathbf{T}$ on $S_2(N)$ are algebraic integers,
because eigenvalues are preserved under transpose. It follows that the
image of $\mathbf{T}_f$ in $E_f$ is an order.
Next put

\[
V_f = S_2(N)/(S_2(N)|_2I_f),
\]

where $S_2(N)|_2I_f$ denotes the space of all $g|_2T$ with $g \in S_2(N)$ and $T \in I_f$.
We identify $V_f^*$ with the quotient of $S_2(N)^*$ by $I_fS_2(N)^*$, and we let $\Lambda_f$ be
the lattice in $V_f^*$ corresponding to $\Lambda/I_f\Lambda$ under this identification. Then
(1) induces an isomorphism of $\mathbf{T}_f$-modules

\[
A_f(\mathbf{C}) \cong V_f^*/\Lambda_f. \tag{2}
\]

We claim that $V_f$ (hence also $V_f^*$) is a free module of rank one over $\mathbf{C} \otimes \mathbf{T}_f$.
Granting the claim, we deduce that $V_f^*$ is free of rank two over $\mathbf{R} \otimes \mathbf{T}_f$.
Since $V_f^* = \mathbf{R} \otimes \Lambda_f$ it follows that there is a sublattice $\Lambda'_f \subset \Lambda_f$ which is
free of rank two over $\mathbf{T}_f$. But (2) gives

\[
T_\ell(A_f) = \varprojlim_n\, (\ell^{-n}\Lambda_f)/\Lambda_f \cong \mathbf{Z}_\ell \otimes \Lambda_f.
\]

Therefore

\[
V_\ell(A_f) \cong \mathbf{Q}_\ell \otimes \Lambda_f \cong \mathbf{Q}_\ell \otimes \Lambda'_f,
\]

and the proposition follows.
It remains to prove the claim. The semisimple ring $\mathbf{C} \otimes E_f$ is canonically
a product

\[
\mathbf{C} \otimes E_f \cong \prod_\sigma \mathbf{C},
\]

where the factors are indexed by the distinct embeddings of $E_f$ in $\mathbf{C}$.
Projection onto the factor corresponding to $\sigma$ gives a character $\mathrm{pr}_\sigma : \mathbf{C} \otimes E_f \to
\mathbf{C}$ sending $T_p + I_f$ to $a(p)^\sigma$ and $\langle d \rangle + I_f$ to $\chi(d)^\sigma$, and a simple $\mathbf{C} \otimes E_f$-module
is a one-dimensional complex vector space on which $\mathbf{C} \otimes E_f$ acts
through one of the characters $\mathrm{pr}_\sigma$. Now as a finitely generated $\mathbf{C} \otimes E_f$-module
$V_f$ is a direct sum of simple modules and is therefore spanned
over $\mathbf{C}$ by eigenvectors with eigencharacters of the form $\mathrm{pr}_\sigma$. Suppose that
$v \in V_f$ is such an eigenvector. Then $v$ is in particular an eigenvector for
the family of operators $\mathcal{T}_f = \{T_p + I_f : p \nmid N\}$. But the action of $\mathcal{T}_f$ on
$V_f = S_2(N)/(S_2(N)|_2I_f)$ is induced by the action of $\mathcal{T} = \{T_p : p \nmid N\}$ on
$S_2(N)$, and as a commuting family of normal operators $\mathcal{T}$ acts semisimply
on $S_2(N)$. It follows that $v$ is the image in $V_f$ of some $\mathcal{T}$-eigenvector
$g \in S_2(N)$. Then Theorem 3 implies that $g$ is a scalar multiple of one of
the cusp forms $f^\sigma$. It follows that the restriction to $\bigoplus_\sigma \mathbf{C}f^\sigma$ of the natural
map of $S_2(N)$ onto $V_f$ is surjective. But the restriction is also injective,
because

\[
\Bigl(\bigoplus_\sigma \mathbf{C}f^\sigma\Bigr) \cap (S_2(N)|_2I_f) \subset \Bigl(\bigoplus_\sigma \mathbf{C}f^\sigma\Bigr) \cap (S_2(N)^{\mathrm{new}}|_2I_f) = \{0\}
\]

by the theory of new forms. Therefore $V_f$ is isomorphic to $\bigoplus_\sigma \mathbf{C}f^\sigma$ as a
$\mathbf{C} \otimes E_f$-module and is consequently free of rank one.


We are now ready to compute the Euler factor of $A_f$ at a prime of good
reduction:


Theorem 4. Let $f \in S_2(N,\chi)$ be a primitive cusp form of level $N$, with
Fourier expansion

\[
f(z) = \sum_{n \ge 1} a(n)e^{2\pi inz}.
\]

If $p$ is a prime not dividing $N$ then

\[
P_p(A_f,t) = \prod_\sigma (1 - a(p)^\sigma t + \chi(p)^\sigma pt^2),
\]

where $\sigma$ runs over the distinct embeddings of $E_f$ in $\mathbf{C}$.


Proof. Fix a prime $\ell$ and let $\rho_\ell$ denote the natural representation

\[
\rho_\ell : \mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q}) \to \mathrm{GL}(V_\ell(A_f)).
\]

It will suffice to prove that for a prime $p$ not dividing $\ell N$ we have

\[
\det(xI - \rho_\ell(\sigma_{\mathfrak{p}})) = \prod_\sigma (x^2 - a(p)^\sigma x + \chi(p)^\sigma p), \tag{1}
\]

where $x$ is an indeterminate, $\mathfrak{p}$ is a prime ideal of $\overline{\mathbf{Q}}$ lying over $p$, and
$\sigma_{\mathfrak{p}} \in \mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q})$ is the Frobenius automorphism at $\mathfrak{p}$. Indeed the left-hand
side of (1) coincides with the characteristic polynomial of $\rho_\ell^*(\sigma_{\mathfrak{p}}^{-1})$ on
$V_\ell(A_f)^*$, because a matrix and its transpose have the same characteristic
polynomial. Also $V_\ell(A_f) = V_\ell(A_f)^{I_{\mathfrak{p}}}$ by the criterion of Néron-Ogg-Shafarevich
[18]. Hence if $x$ is replaced by $1/t$ and the equation multiplied
by $t^{2[E_f:\mathbf{Q}]}$ then (1) becomes the stated formula for $P_p(A_f,t)$, valid for any
prime $p$ not dividing $N\ell$. Since $\ell$ was arbitrary the stated formula follows
for any prime $p$ not dividing $N$.

To prove (1) we recall a fact from linear algebra. Suppose that $B$ is
an $(mn) \times (mn)$ matrix which can be written as an $m \times m$ block matrix
$B = (B^{ij})$ with $n \times n$ blocks $B^{ij}$. Suppose further that the ring generated
over $\mathbf{Z}$ by the matrices $B^{ij}$ is commutative. Then

\[
\det B = \det(\det\nolimits_{m \times m}(B)),
\]

where $\det_{m \times m}(B)$ denotes the determinant of the $m \times m$ matrix over $\mathbf{Z}[B^{ij}]$
with $ij$-entry equal to $B^{ij}$. On replacing $B$ by $xI - B$ we obtain the formula

\[
\det(xI - B) = \det(\det\nolimits_{m \times m}(xI - B))
\]

for the characteristic polynomial of $B$.
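
The block determinant identity is easy to test on a small case. In the sketch below the blocks are integer polynomials in a single $2 \times 2$ matrix `C`, so that they commute; we take $m = 3$ and $n = 2$ and compare $\det B$ with $\det(\det_{m \times m}(B))$ using exact integer arithmetic. The matrix `C`, the coefficient table, and all helper names are our own illustrative choices.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation, computed by counting inversions."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def det(matrix):
    """Exact determinant via the Leibniz formula (adequate for small sizes)."""
    total = 0
    for perm in permutations(range(len(matrix))):
        term = sign(perm)
        for i in range(len(matrix)):
            term *= matrix[i][perm[i]]
        total += term
    return total

def poly_in(C, coeffs):
    """Evaluate coeffs[0]*I + coeffs[1]*C + coeffs[2]*C^2 + ... as a matrix."""
    n = len(C)
    power = [[int(i == j) for j in range(n)] for i in range(n)]
    result = [[0] * n for _ in range(n)]
    for c in coeffs:
        result = [[result[i][j] + c * power[i][j] for j in range(n)] for i in range(n)]
        power = matmul(power, C)
    return result

def block_det(blocks):
    """det_{m x m}(B): Leibniz formula with entries in the commutative ring Z[B^{ij}]."""
    m, n = len(blocks), len(blocks[0][0])
    total = [[0] * n for _ in range(n)]
    for perm in permutations(range(m)):
        term = [[int(i == j) for j in range(n)] for i in range(n)]
        for i in range(m):
            term = matmul(term, blocks[i][perm[i]])
        s = sign(perm)
        total = [[total[i][j] + s * term[i][j] for j in range(n)] for i in range(n)]
    return total

def assemble(blocks):
    """Flatten an m x m array of n x n blocks into an (mn) x (mn) matrix."""
    m, n = len(blocks), len(blocks[0][0])
    return [[blocks[i // n][j // n][i % n][j % n] for j in range(m * n)]
            for i in range(m * n)]

C = [[1, 2], [3, 4]]                                   # hypothetical sample matrix
coeff_table = [[(1, 1), (-1, 2), (0, 0, 1)],
               [(2,), (3, -1), (1, 1, 1)],
               [(0, 1), (5,), (-2, 0, 1)]]
blocks = [[poly_in(C, cs) for cs in row] for row in coeff_table]
B = assemble(blocks)
lhs, rhs = det(B), det(block_det(blocks))
print(lhs == rhs)
```

Commutativity of the blocks is what makes $\det_{m \times m}(B)$ well defined independently of the order of the factors in each Leibniz term.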

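To apply this formula, recall that $V_\ell(A_f)$ is a free module of rank two
over $\mathbf{Q}_\ell \otimes \mathbf{T}_f$ and observe that $\rho_\ell(\sigma_{\mathfrak{p}})$ is a $\mathbf{Q}_\ell \otimes \mathbf{T}_f$-linear transformation of
$V_\ell(A_f)$. Let $\det_{\mathbf{Q}_\ell \otimes \mathbf{T}_f}(xI - \rho_\ell(\sigma_{\mathfrak{p}}))$ denote the characteristic polynomial of
$\rho_\ell(\sigma_{\mathfrak{p}})$ as a $\mathbf{Q}_\ell \otimes \mathbf{T}_f$-linear map. Then

\[
\det(xI - \rho_\ell(\sigma_{\mathfrak{p}})) = N_{\mathbf{Q}_\ell \otimes \mathbf{T}_f/\mathbf{Q}_\ell}\bigl(\det\nolimits_{\mathbf{Q}_\ell \otimes \mathbf{T}_f}(xI - \rho_\ell(\sigma_{\mathfrak{p}}))\bigr),
\]

where $N_{\mathbf{Q}_\ell \otimes \mathbf{T}_f/\mathbf{Q}_\ell}$ is the norm from $\mathbf{Q}_\ell \otimes \mathbf{T}_f[x]$ to $\mathbf{Q}_\ell[x]$ (which coincides on
$\mathbf{T}_f[x]$ with the norm from $\mathbf{T}_f[x]$ to $\mathbf{Q}[x]$). To prove (1) it suffices to show
that

\[
\det\nolimits_{\mathbf{Q}_\ell \otimes \mathbf{T}_f}(xI - \rho_\ell(\sigma_{\mathfrak{p}})) = x^2 - T_px + \langle p \rangle p, \tag{2}
\]

because $N_{\mathbf{T}_f/\mathbf{Q}}(x^2 - T_px + \langle p \rangle p)$ is the right-hand side of (1).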
Write

\[
\mathbf{Q}_\ell \otimes \mathbf{T}_f \cong \prod_{\lambda \mid \ell} E_{f,\lambda},
\]

where $\lambda$ runs over the places of $E_f$ dividing $\ell$ and $E_{f,\lambda}$ denotes the completion
of $E_f$ at $\lambda$. Also put $\rho_\lambda = \mathrm{pr}_\lambda \circ \rho_\ell$, where $\mathrm{pr}_\lambda$ is the projection map
from $\mathbf{Q}_\ell \otimes \mathbf{T}_f$ to the factor $E_{f,\lambda}$ on the right-hand side. Then equation (2)
is equivalent to a system of equations indexed by the places $\lambda$, namely the
equations

\[
\det(xI - \rho_\lambda(\sigma_{\mathfrak{p}})) = x^2 - T_{p,\lambda}x + \langle p \rangle_\lambda p \tag{3}
\]

with $T_{p,\lambda} = \mathrm{pr}_\lambda(T_p)$ and $\langle p \rangle_\lambda = \mathrm{pr}_\lambda(\langle p \rangle)$. It follows from Theorem 2 that
the right-hand side of (3) annihilates $\rho_\lambda(\sigma_{\mathfrak{p}})$. Furthermore, if a nonscalar
$2 \times 2$ matrix over a field is annihilated by a monic polynomial of degree 2
then that polynomial is its characteristic polynomial. Therefore (3) holds
whenever $\rho_\lambda(\sigma_{\mathfrak{p}})$ is nonscalar. Now fix a place $\lambda_0$ dividing $\ell$ and let $P_0$ be
the set of primes $p$ not dividing $N\ell$ such that $\rho_{\lambda_0}(\sigma_{\mathfrak{p}})$ is scalar (note that
this condition is independent of the choice of $\mathfrak{p}$). It remains to show that
(3) holds for $\lambda = \lambda_0$ and all $p \in P_0$.

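Let $e_0 \in \mathbf{Q}_\ell \otimes \mathbf{T}_f$ be the idempotent which generates the kernel of the
map

\[
\prod_{\lambda \neq \lambda_0} \mathrm{pr}_\lambda : \mathbf{Q}_\ell \otimes \mathbf{T}_f \to \prod_{\lambda \neq \lambda_0} E_{f,\lambda},
\]

and choose an integer $\nu \ge 0$ such that the element $d_0 = \ell^\nu e_0$ belongs to
$\mathbf{Z}_\ell \otimes \mathbf{T}_f$. Since $V_\ell(A_f)$ is free of rank two over $\mathbf{Q}_\ell \otimes \mathbf{T}_f$ it follows that the
$\mathbf{Z}_\ell$-module

\[
d_0T_\ell(A_f) = \varprojlim_n\, d_0A_f[\ell^n]
\]

has a $\mathbf{Z}_\ell$-submodule of finite index which is free of rank $2[E_{f,\lambda_0} : \mathbf{Q}_\ell]$. In
particular, putting

\[
A_f[\ell^\infty] = \bigcup_{n \ge 1} A_f[\ell^n],
\]

we see that $d_0A_f[\ell^\infty]$ is infinite.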

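Put $L = \mathbf{Q}(d_0A_f[\ell^\infty])$. Then the torsion subgroup of $A_f(L)$ contains
$d_0A_f[\ell^\infty]$ and is consequently infinite. Hence a theorem of Ribet [16] implies
that $L$ is not contained in the maximal cyclotomic extension of $\mathbf{Q}$.
Therefore the group $G = \mathrm{Gal}(L/\mathbf{Q})$ is nonabelian. Let $\mathrm{Frob}_L(P_0)$ be the
set of Frobenius elements of prime ideals of $L$ lying over primes in $P_0$, and
let $H$ be the closure of the subgroup of $G$ generated by $\mathrm{Frob}_L(P_0)$. We
claim that $H$ is abelian, whence $H$ is a proper subgroup of $G$. Indeed
$\rho_{\lambda_0}$ can be viewed as a faithful representation of $G$ on $d_0V_\ell(A_f)$, and since
the restriction of $\rho_{\lambda_0}$ to $H$ is scalar the claim follows. Now let $P'_0$ be the
complement of $P_0$ in the set of prime numbers not dividing $N\ell$. Also let
$\mathrm{Frob}_L(P'_0) \subset G$ be the set of Frobenius elements of prime ideals of $L$ lying
over primes in $P'_0$. Then the Chebotarev density theorem implies that
the set $G - H$ is contained in the closure of $\mathrm{Frob}_L(P'_0)$. Since any group
is generated by the complement of a proper subgroup, it follows that the
subgroup generated by $\mathrm{Frob}_L(P'_0)$ is dense in $G$.

Next we consider two continuous homomorphisms $\mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q}) \to E_{f,\lambda_0}^\times$.
The first, to be denoted $\kappa_{\lambda_0}$, is obtained by composing the $\ell$-adic cyclotomic
character $\mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q}) \to \mathbf{Z}_\ell^\times$ with the inclusion of $\mathbf{Z}_\ell^\times$ in $E_{f,\lambda_0}^\times$. For the
second character, we compose the canonical surjection

\[
\mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q}) \to \mathrm{Gal}(\mathbf{Q}(\mu_N)/\mathbf{Q}) \cong (\mathbf{Z}/N\mathbf{Z})^\times
\]

with the map

\[
(\mathbf{Z}/N\mathbf{Z})^\times \to \mathbf{T}_f^\times, \qquad d \mapsto \langle d \rangle
\]

followed by $\mathrm{pr}_{\lambda_0}$. This second character will be written $\sigma \mapsto \langle \sigma \rangle_{\lambda_0}$. Note
that if $p$ is a prime not dividing $N\ell$ then $\kappa_{\lambda_0}(\sigma_{\mathfrak{p}}) = p$ and $\langle \sigma_{\mathfrak{p}} \rangle_{\lambda_0} = \langle p \rangle_{\lambda_0}$.
On the other hand, if $p$ happens to belong to $P'_0$, then $\det\rho_{\lambda_0}(\sigma_{\mathfrak{p}}) = \langle p \rangle_{\lambda_0}p$,
because equation (3) holds for $\lambda = \lambda_0$ and $p \in P'_0$. Therefore, writing
$\mathrm{Frob}_{\overline{\mathbf{Q}}}(P'_0)$ for the set of Frobenius elements of prime ideals of $\overline{\mathbf{Q}}$ lying over
primes in $P'_0$, we have

\[
\det\rho_{\lambda_0}(\sigma_{\mathfrak{p}}) = \langle \sigma_{\mathfrak{p}} \rangle_{\lambda_0}\kappa_{\lambda_0}(\sigma_{\mathfrak{p}})
\]

for $\sigma_{\mathfrak{p}} \in \mathrm{Frob}_{\overline{\mathbf{Q}}}(P'_0)$. Since both sides of this equation are continuous,
equality holds on the closure of the subgroup of $\mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q})$ generated by
$\mathrm{Frob}_{\overline{\mathbf{Q}}}(P'_0)$. Let us consider the image of this subgroup under the natural
map $\mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q}) \to \mathrm{Gal}(L/\mathbf{Q})$. The image of $\mathrm{Frob}_{\overline{\mathbf{Q}}}(P'_0)$ is $\mathrm{Frob}_L(P'_0)$, and we
saw above that the subgroup of $\mathrm{Gal}(L/\mathbf{Q})$ generated by $\mathrm{Frob}_L(P'_0)$ is dense
in $\mathrm{Gal}(L/\mathbf{Q})$. Thus the closure of the subgroup of $\mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q})$ generated by
$\mathrm{Frob}_{\overline{\mathbf{Q}}}(P'_0)$ maps onto $\mathrm{Gal}(L/\mathbf{Q})$. We conclude that if $p$ is any prime not
dividing $N\ell$ then $\det\rho_{\lambda_0}(\sigma_{\mathfrak{p}}) = \langle p \rangle_{\lambda_0}p$.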

We can now prove that equation (3) holds for $\lambda = \lambda_0$ and all $p \in P_0$.
Indeed if $B$ is any nonzero $2 \times 2$ matrix over a field which is annihilated by a
monic polynomial of degree 2, and if the constant term of that polynomial is
the determinant of $B$, then the polynomial is the characteristic polynomial
of $B$. This completes the proof.


3.8. Modular abelian varieties. Let us now complete the train of
thought initiated in Theorem 4. Let $f \in S_2(N,\chi)$ be a primitive cusp
form of level $N$ with Fourier expansion

\[
f(z) = \sum_{n \ge 1} a(n)e^{2\pi inz}.
\]

For a prime $p$ dividing $N$ we define

\[
P_p^*(A_f,t) = \prod_\sigma (1 - a(p)^\sigma t).
\]

We also put $g = [E_f : \mathbf{Q}]$, $N^*(A_f) = N^g$, and $W^*(A_f) = \prod_\sigma W(f^\sigma)$,
and, taking $P_p^* = P_p$ for $p$ not dividing $N$, we set

\[
L^*(A_f,s) = \prod_p P_p^*(A_f,p^{-s})^{-1}
\]

and

\[
\Lambda^*(A_f,s) = (N^*(A_f))^{s/2}((2\pi)^{-s}\Gamma(s))^gL^*(A_f,s).
\]

Then

\[
\Lambda^*(A_f,s) = \prod_\sigma \Lambda(f^\sigma,s).
\]

Now according to Proposition 18, each $\Lambda(f^\sigma,s)$ has an analytic continuation
to an entire function of order one satisfying the functional equation
$\Lambda(f^\sigma,s) = W(f^\sigma)\Lambda(f^{\rho\sigma},2-s)$, where $\rho \in \mathrm{Aut}(\mathbf{C})$ denotes complex conjugation.
Since composition with $\rho$ merely permutes the distinct embeddings
of $E_f$ in $\mathbf{C}$, we deduce that $\Lambda^*(A_f,s)$ has an analytic continuation to an
entire function of order one satisfying the functional equation

\[
\Lambda^*(A_f,s) = W^*(A_f)\Lambda^*(A_f,2-s).
\]

Consequently $A_f$ satisfies Conjecture 1*. However, it follows from a theorem
of Carayol [3] (completing work of Deligne [5], Ihara [11], and Langlands
[12]) that $P_p(A_f,t) = \prod_\sigma (1 - a(p)^\sigma t)$ for $p$ dividing $N$, and furthermore
that $N(A_f) = N^g$ and $W(A_f) = \prod_\sigma W(f^\sigma)$. Thus a stronger
assertion holds:

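Theorem 5. For $f \in \mathrm{Prim}_2(N)$ and $g = [E_f : \mathbf{Q}]$ the invariants

\[
L(A_f,s), \quad N(A_f), \quad \text{and} \quad W(A_f)
\]

coincide with

\[
\prod_\sigma L(f^\sigma,s), \quad N^g, \quad \text{and} \quad \prod_\sigma W(f^\sigma)
\]

respectively, where $\sigma$ runs over the distinct embeddings of $E_f$ in $\mathbf{C}$.
Consequently $A_f$ satisfies Conjecture 1.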


Let $\mathrm{Prim}_2$ denote the union of the sets $\mathrm{Prim}_2(N)$ over all positive integers
$N$, and let $A$ be an abelian variety over $\mathbf{Q}$. If $A$ is isogenous over
$\mathbf{Q}$ to a product of abelian varieties of the form $A_f$ with $f \in \mathrm{Prim}_2$, then
we call $A$ a modular abelian variety, or in the case of dimension one, a
modular elliptic curve. Since the $L$-function, conductor, and root number
of $A$ depend on $A$ only up to isogeny over $\mathbf{Q}$, and since all three of these
invariants respect products, we deduce:


Corollary. If $A$ is a modular abelian variety then $A$ satisfies Conjecture 1.


In the remaining paragraphs we discuss a partial converse to the corollary
in the case of dimension one, the converse being contingent on a suitable
strengthening of Conjecture 1.

3.9. Conjecture 1 with twists. As we have already mentioned, Conjecture 1
is a special case of a more general hypothesis about $L$-functions of
motives. We shall now state a slight extension of Conjecture 1 (still far from
the general case) in which we allow twists of the motives in Conjecture 1 by
Artin motives. For the application we have in mind it would suffice to consider
Artin motives corresponding to Dirichlet characters, but specializing
the context in this way does not seem to simplify the formulation.

Consider as before an abelian variety $A$ over $\mathbf{Q}$ together with its associated
family of $\ell$-adic representations $\{\rho_\ell\}$. In addition, let $\tau$ be a continuous
finite-dimensional complex representation of $\mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q})$, and let $E_\tau \subset \mathbf{C}$ be
a finite extension of $\mathbf{Q}$ such that $\tau$ is realizable on an $E_\tau$-vector space $W$.
If $\lambda$ is a place of $E_\tau$ lying over some $\ell$ and $E_{\tau,\lambda}$ is the completion of $E_\tau$ at
$\lambda$ then we obtain a representation $\rho_\ell \otimes \tau$ of $\mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q})$ on the $E_{\tau,\lambda}$-vector
space

\[
U_\lambda = (E_{\tau,\lambda} \otimes_{\mathbf{Q}_\ell} V_\ell(A)) \otimes (E_{\tau,\lambda} \otimes_{E_\tau} W).
\]

Given a prime $p$, we choose $\ell \neq p$ and put

\[
P_p(A,\tau,t) = \det\bigl(1 - t(\rho_\ell \otimes \tau)(\sigma_{\mathfrak{p}}) \,\big|\, U_\lambda^{I_{\mathfrak{p}}}\bigr).
\]

As before, the semistable reduction theorem implies that the coefficients
of $P_p(A,\tau,t)$ lie in $E_\tau$ and are independent of $\ell$ and $\lambda$. Furthermore, the
complex numbers $\alpha_{i,p}$ in the factorization

\[
P_p(A,\tau,t) = \prod_{i=1}^{2g\dim\tau} (1 - \alpha_{i,p}t)
\]

still satisfy

\[
|\alpha_{i,p}| \le \sqrt{p} \qquad (1 \le i \le 2g\dim\tau),
\]

so that the Euler product

\[
L(A,\tau,s) = \prod_p P_p(A,\tau,p^{-s})^{-1}
\]

converges for $\mathrm{Re}(s) > 3/2$. Also, the conductor $N(A,\tau)$ of the compatible
family $\{\rho_\ell \otimes \tau\}_\ell$ is defined, as is the root number $W(A,\tau)$, which is a
complex number of absolute value 1 (no longer necessarily equal to $\pm 1$
unless $\tau$ is equivalent to its contragredient $\tau^*$). If the conductors $N(A)$
and $N(\tau)$ of $A$ and $\tau$ are relatively prime, then

\[
N(A,\tau) = N(A)^{\dim\tau}N(\tau)^{2g}
\]

and

\[
W(A,\tau) = \det\tau((-1)^gN(A))\,W(A)^{\dim\tau}\,W(\tau)^{2g},
\]

where in the second equation $W(\tau)$ is the root number of $\tau$ and $\det\tau$ is
thought of as a Dirichlet character.

Conjecture 2. Put $\Lambda(A,\tau,s) = N(A,\tau)^{s/2}((2\pi)^{-s}\Gamma(s))^{g\dim\tau}L(A,\tau,s)$.
Then $\Lambda(A,\tau,s)$ has an analytic continuation to an entire function of order
one satisfying the functional equation

\[
\Lambda(A,\tau,s) = W(A,\tau)\Lambda(A,\tau^*,2-s).
\]

For $A = A_f$ and certain $\tau$ with solvable image a statement along these
lines follows from the Rankin-Selberg method and the theory of base change
(cf. [1], [13], [21]). If $A = A_f$ and $\tau$ is one-dimensional then Conjecture 2
is subsumed in the results of Carayol [3].


3.10. Epilogue: the Shimura-Taniyama conjecture. Let us now
consider Conjecture 2 in the special case where $\dim A$ and $\dim\tau$ are both
one. Thus $A$ is an elliptic curve and $\tau$ can be identified with a primitive
Dirichlet character $\chi$. We shall further assume that the integers $N = N(A)$
and $r = N(\chi)$ are relatively prime, whence

\[
N(A,\chi) = Nr^2
\]

and

\[
W(A,\chi) = \chi(-N)W(A)W(\chi)^2.
\]

In this setting the assertion of Conjecture 2 has a particularly elementary
formulation. To begin with, let us put

\[
a(p) =
\begin{cases}
1 - |A(\mathbf{F}_p)| + p & \text{if $A$ has good reduction at $p$} \\
1 & \text{if $A$ has split multiplicative reduction at $p$} \\
-1 & \text{if $A$ has nonsplit multiplicative reduction at $p$} \\
0 & \text{if $A$ has additive reduction at $p$.}
\end{cases}
\]

Then the Euler factors of $A$ are determined by the elementary rule

\[
P_p(A,t) =
\begin{cases}
1 - a(p)t + pt^2 & \text{if $A$ has good reduction at $p$} \\
1 - a(p)t & \text{if $A$ has bad reduction at $p$.}
\end{cases}
\]

Therefore

\[
L(A,s) = \prod_{p \nmid N(A)} (1 - a(p)p^{-s} + p^{1-2s})^{-1} \cdot \prod_{p \mid N(A)} (1 - a(p)p^{-s})^{-1}.
\]
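
To make the elementary rule concrete, the following sketch computes $a(p)$ and the Euler factor $P_p(A,t)$ by brute-force point counting. It assumes, as an example of our own choosing, the curve $y^2 + y = x^3 - x^2$, which has conductor 11 and split multiplicative reduction at 11. For the bad prime we rely on an outside fact, not stated in the text: for a minimal model, $p + 1$ minus the number of points on the reduced cubic (singular point included) equals $1$, $-1$, or $0$ in the three bad cases, so the same counting formula may be used at every prime.

```python
def count_points(p):
    """Projective points of y^2 + y = x^3 - x^2 over F_p:
    affine solutions plus the single point at infinity."""
    affine = sum(1 for x in range(p) for y in range(p)
                 if (y * y + y - x ** 3 + x * x) % p == 0)
    return affine + 1

def a_p(p):
    """a(p) = p + 1 - #A(F_p); for this model the same formula also
    returns the value prescribed at the bad prime 11 (assumption above)."""
    return p + 1 - count_points(p)

def euler_factor(p, conductor=11):
    """Coefficients [c0, c1, c2] of P_p(A, t) = c0 + c1*t + c2*t^2."""
    if conductor % p != 0:
        return [1, -a_p(p), p]      # good reduction: 1 - a(p) t + p t^2
    return [1, -a_p(p), 0]          # bad reduction:  1 - a(p) t

for p in (2, 3, 5, 7, 11, 13):
    print(p, a_p(p), euler_factor(p))
```

At the primes of good reduction one can also confirm the bound $a(p)^2 \le 4p$, which says that the reciprocal roots of $P_p(A,t)$ have absolute value $\sqrt{p}$.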


Furthermore, since we are assuming that $r$ is relatively prime to $N$, the
$L$-function $L(A,\chi,s)$ coincides with the naive twist of $L(A,s)$ by $\chi$: if we
write $L(A,s)$ as a Dirichlet series

\[
L(A,s) = \sum_{n \ge 1} a(n)n^{-s},
\]

then

\[
L(A,\chi,s) = \sum_{n \ge 1} \chi(n)a(n)n^{-s}.
\]

Thus in the case at hand Conjecture 2 asserts that the function

\[
\Lambda(A,\chi,s) = (Nr^2)^{s/2}(2\pi)^{-s}\Gamma(s) \sum_{n \ge 1} \chi(n)a(n)n^{-s}
\]

is entire of order one and satisfies the functional equation

\[
\Lambda(A,\chi,s) = \chi(-N)W(A)W(\chi)^2\Lambda(A,\overline{\chi},2-s).
\]

Now compare this assertion to condition (i) of the following result, which
is a version of Weil's converse to Hecke theory specialized to the case of
weight 2 and trivial character:

Theorem 6. Let $N$ be a positive integer and $a(1), a(2), a(3), \dots$ a sequence
of complex numbers satisfying the formal identity

\[ \sum_{n \ge 1} a(n) n^{-s} = \prod_{p \nmid N} \bigl(1 - a(p)p^{-s} + p^{1-2s}\bigr)^{-1} \cdot \prod_{p \mid N} \bigl(1 - a(p)p^{-s}\bigr)^{-1}. \]

Suppose furthermore that

\[ |a(p)| \le \begin{cases} 2p & \text{if } p \nmid N \\ p & \text{if } p \mid N, \end{cases} \]

so that the Dirichlet series and Euler product actually converge for $\mathrm{Re}(s) > 2$. Put

\[ f(z) = \sum_{n \ge 1} a(n) e^{2\pi i n z}. \]

Then the following are equivalent:

(i) There exists a complex number $W(f)$ of absolute value 1 such that
for every positive integer $r$ prime to $N$ and every primitive Dirichlet
character $\chi$ modulo $r$, the function

\[ \Lambda(f,\chi,s) = (Nr^2)^{s/2} (2\pi)^{-s} \Gamma(s) \sum_{n \ge 1} \chi(n) a(n) n^{-s} \]

has an analytic continuation to an entire function of order one
satisfying the functional equation

\[ \Lambda(f,\chi,s) = \chi(-N)\, W(f)\, W(\chi)^2\, \Lambda(f,\bar\chi, 2-s). \]

(ii) $f$ is a primitive cusp form of weight 2 for $\Gamma_0(N)$.
Theorem 6 can be pieced together from Weil [22], Deligne-Serre ([7], p.
515, Lemme 4.9), and the theory of new forms ([2],[4],[14],[15]). It applies
in particular to the situation at hand, because if $a(p)$ is the coefficient of
$p^{-s}$ in the L-series $L(A,s)$ of an elliptic curve $A$ over $\mathbf{Q}$, then

\[ |a(p)| \le \begin{cases} 2\sqrt{p} & \text{if } p \nmid N \\ 1 & \text{if } p \mid N, \end{cases} \]

which is a stronger estimate than that required by the hypothesis of the
theorem. Thus conditions (i) and (ii) are equivalent for $L(A,s)$, and if we
grant Conjecture 2 then it follows that there is a primitive cusp form $f$ for
$\Gamma_0(N)$ of weight 2 such that $L(f,s) = L(A,s)$. Now this equation implies in
particular that the Fourier coefficients of $f$ are rational, whence $E_f = \mathbf{Q}$ and
$A_f$ is an elliptic curve. Furthermore, Theorem 5 gives $L(A_f,s) = L(A,s)$,
and then the isogeny theorem of Faltings implies that $A$ is isogenous over
$\mathbf{Q}$ to $A_f$. Thus $A$ is a modular elliptic curve. To summarize, if we grant
Conjecture 2, then we are forced to believe:


Conjecture 3. Every elliptic curve over $\mathbf{Q}$ is modular.
GALOIS COHOMOLOGY
LAWRENCE C. WASHINGTON 


In these lectures, we give a very utilitarian description of the Galois 
cohomology needed in Wiles’ proof. For a more general approach, see any 
of the references. 

First we fix some notation. For a field $K$, let $\bar{K}$ be a separable closure
of $K$ and let $G_K = \mathrm{Gal}(\bar{K}/K)$. For a prime $p$, let $G_p = G_{\mathbf{Q}_p}$, where $\mathbf{Q}_p$ is
the field of $p$-adic numbers, and let $I_p \subset G_p$ be the inertia group.

Let G be a group, usually either finite or profinite, and let X be an 
abelian group on which G acts. Such an X will be called a G-module. 
If there are topologies to consider, we assume the action is continuous, 
though we shall mostly ignore continuity questions except to say that all 
maps, actions, etc. are continuous when they should be. 


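§1. $H^0$, $H^1$, AND $H^2$
We start with

\[ H^0(G,X) = X^G = \{x \in X \mid gx = x \text{ for all } g \in G\}. \]

For example, $G_K$ acts on $\bar{K}^\times$ and

\[ H^0(G_K, \bar{K}^\times) = K^\times. \]

For another example, let $\mu_n$ denote the group of $n$-th roots of unity. Then

\[ H^0(G_{\mathbf{Q}}, \mu_n) = \begin{cases} \{\pm 1\} & \text{if } 2 \mid n, \\ 1 & \text{if } 2 \nmid n. \end{cases} \]

Occasionally, for a finite group $G$, we will need the modified Tate
cohomology group

\[ \hat{H}^0(G,X) = X^G / \mathrm{Norm}(X), \]

where $\mathrm{Norm}(x) = \sum_{g \in G} gx$ (if $X$ is written additively). For example, if $X$
is an abelian group of odd order on which $\mathrm{Gal}(\mathbf{C}/\mathbf{R})$ acts, then
$\mathrm{Norm}(X) \supseteq 2(X^G) = X^G$, so $\hat{H}^0(\mathrm{Gal}(\mathbf{C}/\mathbf{R}), X) = 0$.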

We now skip $H^1(G,X)$ in order to give a brief description of $H^2(G,X)$.
Define

\[ H^2(G,X) = \text{cocycles}/\text{coboundaries}, \]

where a cocycle is a map (of sets) $f : G \times G \to X$ satisfying

\[ \delta f = f(g_1, g_2 g_3) - f(g_1 g_2, g_3) + g_1 \cdot f(g_2, g_3) - f(g_1, g_2) = 0, \]

and where $f$ is a coboundary if there is a map $h : G \to X$ such that

\[ f(g_1, g_2) = g_1 \cdot h(g_2) - h(g_1 g_2) + h(g_1) = \delta h. \]

This definition might seem a little strange; we will give a slightly different
form of it later after we define $H^1(G,X)$.

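Here is an example. Let $p$ be prime and let $G = G_p$. Let $a, b \in \mathbf{Q}_p^\times$ with
$a$ not a square. Define

\[ f(g_1, g_2) = \begin{cases} b & \text{if } g_1\sqrt{a} = -\sqrt{a} \text{ and } g_2\sqrt{a} = -\sqrt{a}, \\ 1 & \text{otherwise.} \end{cases} \]

It is easy to check that $f : G_p \times G_p \to \bar{\mathbf{Q}}_p^\times$ satisfies the cocycle condition,
hence yields an element of $H^2(G_p, \bar{\mathbf{Q}}_p^\times)$. Suppose $b$ is a norm from $\mathbf{Q}_p(\sqrt{a})$,
so $b = x^2 - ay^2$ for some $x, y \in \mathbf{Q}_p$. Let $h(g) = x + y\sqrt{a}$ if $g\sqrt{a} = -\sqrt{a}$
and $h(g) = 1$ otherwise. Then

\[ f(g_1, g_2) = (g_1 h(g_2))\, h(g_1) / h(g_1 g_2), \]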


so the element of $H^2$ we obtain is trivial. Conversely, it can be shown
that if this element is trivial, then $b$ is a norm from $\mathbf{Q}_p(\sqrt{a})$. Recall the
Hilbert symbol $(a,b)_p$, which equals 1 if $b$ is a norm from $\mathbf{Q}_p(\sqrt{a})$ and equals
$-1$ otherwise. Thus the above cohomology class we obtain is essentially
the same as the Hilbert symbol. We also have $(a,b)_p = 1$ if and only if
$x_1^2 - a x_2^2 - b x_3^2 + ab\, x_4^2 = 0$ has a non-zero solution in $\mathbf{Q}_p$. Equivalently,
$(a,b)_p = 1$ if and only if the generalized quaternion algebra $\mathbf{Q}_p[i,j,k]$, with
$i^2 = a$, $j^2 = b$, $k^2 = -ab$, $ij = k$, etc., is isomorphic to the algebra of
two-by-two matrices over $\mathbf{Q}_p$ (rather than being a division algebra). In
general, $H^2(G_K, \bar{K}^\times)$ is known as the Brauer group and classifies central
simple algebras over the field $K$. We will need the following result.


Proposition 1. Let $p$ be a prime number. Then $H^2(G_p, \bar{\mathbf{Q}}_p^\times) \simeq \mathbf{Q}/\mathbf{Z}$.


This is an important result in local class field theory. For a proof,
see [Se]. In our example, the cohomology class of $f$ is 0 if $(a,b)_p = 1$ and
is $\tfrac{1}{2}$ mod $\mathbf{Z}$ if $(a,b)_p = -1$.
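
The cocycle $f$ of the example depends only on how $g_1, g_2$ act on $\sqrt{a}$, so both claims (that $f$ is a 2-cocycle, and that it is the coboundary of $h$ when $b = x^2 - ay^2$) can be checked on the quotient $\mathrm{Gal}(\mathbf{Q}_p(\sqrt{a})/\mathbf{Q}_p) \simeq \mathbf{Z}/2\mathbf{Z}$. The sketch below is an illustration under simplifying assumptions: it works with rational numbers in place of $p$-adic ones, writes the group as $\{0,1\}$ under addition mod 2, and represents $u + v\sqrt{a}$ as the pair `(u, v)`. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def f(g1, g2, b):
    """The 2-cocycle of the example: b if both g1 and g2 negate sqrt(a), else 1."""
    return b if (g1 == 1 and g2 == 1) else Fraction(1)


def is_2cocycle(b):
    """Multiplicative cocycle condition; the group acts trivially on the values 1 and b."""
    for g1, g2, g3 in product((0, 1), repeat=3):
        lhs = f(g1, (g2 + g3) % 2, b) * f(g2, g3, b)
        rhs = f((g1 + g2) % 2, g3, b) * f(g1, g2, b)
        if lhs != rhs:
            return False
    return True


def qmul(u, v, a):
    """Product of u0 + u1*sqrt(a) and v0 + v1*sqrt(a)."""
    return (u[0] * v[0] + a * u[1] * v[1], u[0] * v[1] + u[1] * v[0])


def act(g, u):
    """The nontrivial element sends sqrt(a) to -sqrt(a)."""
    return (u[0], -u[1]) if g == 1 else u


def is_coboundary_of_h(a, x, y):
    """Check f(g1,g2) * h(g1 g2) == (g1 h(g2)) * h(g1) for b = x^2 - a*y^2."""
    b = x * x - a * y * y
    h = {0: (Fraction(1), Fraction(0)), 1: (x, y)}
    for g1, g2 in product((0, 1), repeat=2):
        lhs = qmul((f(g1, g2, b), Fraction(0)), h[(g1 + g2) % 2], a)
        rhs = qmul(act(g1, h[g2]), h[g1], a)
        if lhs != rhs:
            return False
    return True


if __name__ == "__main__":
    print(is_2cocycle(Fraction(3)))
    print(is_coboundary_of_h(Fraction(2), Fraction(3), Fraction(1)))
```

The coboundary identity is checked in the cleared-denominator form $f(g_1,g_2)\,h(g_1g_2) = (g_1h(g_2))\,h(g_1)$ to avoid dividing in the quadratic field.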

We now turn our attention to $H^1$, which is the most important for us.
Define

\[ H^1(G,X) = \text{cocycles}/\text{coboundaries}, \]

where a cocycle is a map $f : G \to X$ satisfying $f(g_1 g_2) = f(g_1) + g_1 f(g_2)$
(a "crossed homomorphism") and where $f$ is a coboundary if there exists
$x \in X$ such that $f(g) = gx - x$.
Before continuing, we write the cocycle conditions in a different form
that perhaps seems more natural. For a 2-cocycle $f$, let

\[ F(a,b,c) = a \cdot f(a^{-1}b, a^{-1}c), \]

where $a, b, c \in G$. Then $F(ga, gb, gc) = g \cdot F(a,b,c)$ and the cocycle
condition becomes

\[ F(a,b,c) - F(a,b,d) + F(a,c,d) - F(b,c,d) = 0. \]

For a 1-cocycle $f$, let $F(a,b) = a \cdot f(a^{-1}b)$. Then $F(ga, gb) = g \cdot F(a,b)$
and the cocycle condition reads

\[ F(a,b) - F(a,c) + F(b,c) = 0. \]

We can even describe $H^0$ in this manner: a 0-cocycle is a map $f$ from
the one point set to $X$, hence simply an element $x$ of $X$, that satisfies
$gx - x = 0$. If we let $F(a) = ax$, then $F(ga) = g \cdot F(a)$ and $F(a) - F(b) = 0$
for all $a, b \in G$. In all three cases, the coboundary condition says that $F$ is
the coboundary of a function from the next lower dimension. For example,
the function $F$ for a 2-coboundary is of the form $H(a,b) - H(a,c) + H(b,c)$
for a function $H$ satisfying $H(ga, gb) = g \cdot H(a,b)$ (explicitly, $H(a,b) =
a \cdot h(a^{-1}b)$ in the above notation). It should now be clear how to define
higher cohomology groups $H^n(G,X)$ for $n \ge 3$. With one exception, we
will not need these higher groups, and in this one exception, the element
we need will be 0; therefore, we may safely ignore them for the present
exposition.
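
The passage from an inhomogeneous 1-cocycle $f$ to the homogeneous function $F(a,b) = a \cdot f(a^{-1}b)$ can be checked by hand on a small example. The sketch below assumes $G = \mathbf{Z}/3\mathbf{Z}$ (written additively) acting on $X = \mathbf{Z}/7\mathbf{Z}$, with the generator acting as multiplication by 2 (note $2^3 \equiv 1 \bmod 7$); this particular module and all names are illustrative choices.

```python
N, M, U = 3, 7, 2  # G = Z/3, X = Z/7, generator acts as multiplication by 2


def act(g, x):
    """Action of the group element g (an integer mod 3) on x in Z/7."""
    return (pow(U, g % N, M) * x) % M


def f(g, x0=1):
    """Inhomogeneous 1-cocycle with f(generator) = x0: f(g^i) = (1 + u + ... + u^(i-1)) x0."""
    return (sum(pow(U, k, M) for k in range(g % N)) * x0) % M


def F(a, b):
    """Homogeneous form F(a, b) = a . f(a^{-1} b); the group is additive, so a^{-1} b = b - a."""
    return act(a, f((b - a) % N))


def check_all():
    G = range(N)
    inhomogeneous = all(
        f((g1 + g2) % N) == (f(g1) + act(g1, f(g2))) % M for g1 in G for g2 in G
    )
    homogeneous = all(
        (F(a, b) - F(a, c) + F(b, c)) % M == 0 for a in G for b in G for c in G
    )
    equivariant = all(
        F((g + a) % N, (g + b) % N) == act(g, F(a, b)) for g in G for a in G for b in G
    )
    return inhomogeneous, homogeneous, equivariant


if __name__ == "__main__":
    print(check_all())
```

Each of the three flags corresponds to one statement in the text: the crossed-homomorphism rule for $f$, the relation $F(a,b) - F(a,c) + F(b,c) = 0$, and the equivariance $F(ga,gb) = g \cdot F(a,b)$.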

A fundamental fact that will be used quite often is the following. Suppose

\[ 0 \to A \to B \to C \to 0 \]

is a short exact sequence of $G$-modules. Then there is a long exact sequence
of cohomology groups (write $H^n(X)$ for $H^n(G,X)$)

\[ 0 \to H^0(A) \to H^0(B) \to H^0(C) \to H^1(A) \to H^1(B) \to H^1(C) \to H^2(A) \to H^2(B) \to \cdots \]

The proof is a standard exercise in homological algebra.
Let's return to $H^1(G,X)$. Suppose the action of $G$ is trivial, so $gx = x$
for all $g$ and $x$. Then cocycles are simply homomorphisms $G \to X$. A
coboundary $f(g) = gx - x$ is the 0-map. Therefore we have proved the
useful fact that

\[ H^1(G,X) = \mathrm{Hom}(G,X) \quad \text{if the action of $G$ is trivial.} \]

Here "Hom" means (continuous) homomorphisms of groups. For example,
let $K$ be a field and let $G = G_K$. Then $G_K$ acts trivially on $\mathbf{Z}/2\mathbf{Z}$,
so $H^1(G_K, \mathbf{Z}/2\mathbf{Z}) = \mathrm{Hom}(G_K, \mathbf{Z}/2\mathbf{Z})$, which corresponds to the separable
quadratic (or trivial) extensions of $K$; namely, if $f$ is a non-trivial
homomorphism, then the fixed field of the kernel of $f$ is a quadratic extension.
The trivial homomorphism corresponds to the trivial extension $K/K$.

Suppose now that $G$ is a finite cyclic group: $G = \langle g \rangle$ with $g^n = 1$. The
cocycle relation yields by induction that

\[ f(g^i) = (1 + g + g^2 + \cdots + g^{i-1}) f(g). \]

Therefore $f(1) = f(g^n) = \mathrm{Norm}(f(g))$. The cocycle condition easily implies
that $f(1) = 0$, so $f(g)$ is in the kernel of Norm. Any such choice for
$f(g)$ yields a cocycle via the above formula. A coboundary corresponds to
$f(g) = (g - 1)x$ for some $x \in X$. Therefore

\[ H^1(G,X) \simeq (\text{Kernel of Norm})/(g-1)X \quad \text{for a finite cyclic group } G. \]

As an example, consider a $G_{\mathbf{R}}$-module $X$ of odd order. Let $c$ be complex
conjugation. Write $X = \frac{1+c}{2}X \oplus \frac{1-c}{2}X$. Note that $\frac{1-c}{2}X$ is the kernel of
$\mathrm{Norm} = 1 + c$, and is also equal to $(c-1)X$. Therefore $H^1(G_{\mathbf{R}}, X) = 0$.
More generally, it can be shown that if $G$ and $X$ are finite with relatively
prime orders, then $H^i(G,X) = 0$ for all $i > 0$, and also for $i = 0$ if we use
the modified groups $\hat{H}^0(G,X)$.
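
The formula $H^1(G,X) \simeq (\text{Kernel of Norm})/(g-1)X$ is easy to evaluate by brute force for small modules. The following sketch assumes $G$ cyclic of order $n$ and $X = \mathbf{Z}/m\mathbf{Z}$ with the generator acting as multiplication by a unit $u$ satisfying $u^n \equiv 1 \bmod m$; the function names are hypothetical, and only the orders of the groups are computed.

```python
def norm_multiplier(n, m, u):
    """Norm = 1 + u + ... + u^(n-1), as a multiplier on Z/m."""
    return sum(pow(u, k, m) for k in range(n)) % m


def cyclic_h1_order(n, m, u):
    """#H^1(G, Z/m) = #Ker(Norm) / #((g - 1) X) for G cyclic of order n acting via u."""
    assert pow(u, n, m) == 1 % m, "u must have order dividing n in (Z/m)^x"
    norm = norm_multiplier(n, m, u)
    kernel = {x for x in range(m) if (norm * x) % m == 0}
    image = {((u - 1) * x) % m for x in range(m)}
    assert image <= kernel  # Norm kills (g - 1) X
    return len(kernel) // len(image)


def tate_h0_order(n, m, u):
    """Order of the modified group X^G / Norm(X)."""
    norm = norm_multiplier(n, m, u)
    fixed = {x for x in range(m) if ((u - 1) * x) % m == 0}
    norm_image = {(norm * x) % m for x in range(m)}
    return len(fixed) // len(norm_image)


if __name__ == "__main__":
    # Complex conjugation acting by -1 on a module of odd order 15
    print(cyclic_h1_order(2, 15, 14), tate_h0_order(2, 15, 14))
    # The same action on Z/4, where the orders are not coprime
    print(cyclic_h1_order(2, 4, 3), tate_h0_order(2, 4, 3))
```

In the first case both groups vanish, as the text predicts for coprime orders; in the second case neither does.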

When $G$ is infinite cyclic, or is the profinite completion of an infinite
cyclic group, and $X$ is finite, then there is a similar description. Let $g$ be
a (topological) generator. Let $x \in X$ be arbitrary. There are $k, n > 0$ such
that $g^n x = x$ and $kx = 0$. Define a cocycle by $f(g^i) = (1 + g + \cdots + g^{i-1})x$
for $i > 0$. If $i > j$ and $i \equiv j \bmod kn$, then $g^j + \cdots + g^{i-1}$ is a multiple
of $1 + g^n + \cdots + g^{n(k-1)}$, which kills $x$. Therefore $f(g^i)$ depends only on
$i \bmod kn$, so $f$ extends to a continuous cocycle on all of $G$. Since, as above,
every cocycle must be of this form, we have

\[ H^1(G,X) \simeq X/(g-1)X \]

when $G$ is (the profinite closure of) an infinite cyclic group and $X$ is finite.
This result will be applied later to the case where $\mathbf{F}$ is a finite field and
$G = \mathrm{Gal}(\bar{\mathbf{F}}/\mathbf{F})$, which is generated by the Frobenius map.

Let $L/K$ be a finite extension of fields with cyclic Galois group $G$ generated
by $g$. Then $G$ acts on $L^\times$. The famous Hilbert Theorem 90 says that
if $x \in L^\times$ has Norm 1 then $x = gy/y$ for some $y \in L^\times$. This is precisely
the statement that $H^1(G, L^\times) = 0$. More generally, we have

\[ H^1(\mathrm{Gal}(L/K), L^\times) = 0 \]

for any Galois extension of fields $L/K$ ([Se]).
Let $n \ge 1$ be prime to the characteristic of the field $K$ and consider the
exact sequence of $G_K$-modules

\[ 1 \to \mu_n \to \bar{K}^\times \xrightarrow{\ n\ } \bar{K}^\times \to 1 \]

induced by the $n$-th power map. The long exact sequence of cohomology
groups includes the portion

\[ H^0(G_K, \bar{K}^\times) \xrightarrow{\ n\ } H^0(G_K, \bar{K}^\times) \to H^1(G_K, \mu_n) \to H^1(G_K, \bar{K}^\times), \]

where the first map is the $n$-th power map. Since the last group is 0, we
find that

\[ H^1(G_K, \mu_n) \simeq K^\times/(K^\times)^n. \]

Explicitly, let $a \in K^\times$ and fix an $n$th root $\alpha$ of $a$. Then $g \mapsto g\alpha/\alpha$ defines a
cocycle and hence an element of $H^1(G_K, \mu_n)$. When $\mu_n \subset K$, $H^1(G_K, \mu_n)$
becomes $\mathrm{Hom}(G_K, \mu_n)$, which corresponds (in an obvious many to one
fashion) to cyclic extensions of $K$ of degree dividing $n$, and $\alpha$ is a Kummer
generator for this extension (and, correspondingly, there are several Kummer
generators mod $n$th powers for each extension). When $n = 2$, note that
$\mathbf{Z}/2\mathbf{Z}$ and $\mu_2$ are isomorphic as $G_K$-modules, and we find that $H^1(G_K, \mu_2)$
classifies quadratic extensions of $K$, though in a slightly different manner
than $H^1(G_K, \mathbf{Z}/2\mathbf{Z})$.


§2. PRELIMINARY RESULTS 


Suppose $H$ is a (closed) normal subgroup of a group $G$ and $X$ is a
$G$-module. Then $X^H$ is a module for $G/H$ in the obvious way. A cocycle for
$G/H$ can also be regarded as a cocycle for $G$ ("inflation") by composing
with the map $G \to G/H$. A cocycle for $G$ can be regarded as a cocycle
for $H$ by restriction. Also, $G/H$ acts on $H^1(H,X)$ by the formula $f^g(h) =
g \cdot f(g^{-1}hg)$, where $f$ is a cocycle and $g$ is a representative of a coset in
$G/H$. An easy calculation shows that if $g'$ is another representative of
the coset of $g$ then $f^{g'}$ and $f^g$ differ by a coboundary, so the action is
well-defined.


Proposition 2 (Inflation-Restriction). There is an exact sequence

\[ 0 \to H^1(G/H, X^H) \xrightarrow{\mathrm{inf}} H^1(G,X) \xrightarrow{\mathrm{res}} H^1(H,X)^{G/H} \to H^2(G/H, X^H) \to H^2(G,X). \]

This is the exact sequence of terms of low degree in the Hochschild-Serre
spectral sequence, hence is sometimes referred to by that name. For
a proof, and the definition of the map from $H^1$ to $H^2$, see [Sh].

For example, let $p$ be a prime and let $G = G_p$. Let $H = I_p =
\mathrm{Gal}(\bar{\mathbf{Q}}_p/\mathbf{Q}_p^{\mathrm{unr}})$, where $\mathbf{Q}_p^{\mathrm{unr}}$ is the maximal unramified extension of $\mathbf{Q}_p$, so
$I_p$ is the inertia subgroup of $G_p$, and $G_p/I_p \simeq \mathrm{Gal}(\bar{\mathbf{F}}_p/\mathbf{F}_p)$. The beginning
of the above sequence implies that

\[ H^1(G_p/I_p, X^{I_p}) \simeq \mathrm{Ker}\bigl(H^1(G_p, X) \to H^1(I_p, X)\bigr). \]

Thus we can regard $H^1(G_p/I_p, X^{I_p})$ as the subgroup of $H^1(G_p, X)$ consisting
of those cohomology classes that become trivial when restricted to the
inertia subgroup; hence, we call these the unramified classes. For example,
when $X = \mathbf{Z}/2\mathbf{Z}$, the unramified classes are those homomorphisms from $G_p$
to $\mathbf{Z}/2\mathbf{Z}$ that are 0 on $I_p$, hence that can be identified with homomorphisms
from $G_p/I_p$ to $\mathbf{Z}/2\mathbf{Z}$. There are two such homomorphisms, the 0
homomorphism and the one corresponding to the unique unramified quadratic
extension of $\mathbf{Q}_p$ (or of $\mathbf{F}_p$). This is well-known, but is also a consequence
of the following, which often allows us to calculate the order of the group
of unramified classes, since $H^0(G_p, X) = X^{G_p}$.


Lemma 1. Let $X$ be finite. Then $\#H^1(G_p/I_p, X^{I_p}) = \#H^0(G_p, X)$ (and
both are finite).


Proof. There is an exact sequence

\[ 0 \to X^{G_p} \to X^{I_p} \xrightarrow{\ \mathrm{Frob} - 1\ } X^{I_p} \to X^{I_p}/(\mathrm{Frob} - 1)X^{I_p} \to 0. \]

The exactness at the first $X^{I_p}$ follows from the fact that if $x \in X^{I_p}$ and
$(\mathrm{Frob} - 1)x = 0$, then $x$ is fixed by both $I_p$ and Frob, which (topologically)
generate $G_p$. The first term gives $H^0(G_p, X)$ and the last term gives
$H^1(G_p/I_p, X^{I_p})$. The result follows easily. $\square$
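
The counting argument in the proof amounts to the fact that an endomorphism of a finite abelian group has kernel and cokernel of the same order. A minimal numerical sketch, assuming for illustration that inertia acts trivially on $X = (\mathbf{Z}/m\mathbf{Z})^2$ and that Frobenius acts through a $2 \times 2$ matrix (both are assumptions made only for this example, as are the names):

```python
from itertools import product


def kernel_and_cokernel_orders(frob, m):
    """Orders of Ker(Frob - 1) and Coker(Frob - 1) on X = (Z/m)^2."""

    def frob_minus_one(v):
        x, y = v
        return (
            (frob[0][0] * x + frob[0][1] * y - x) % m,
            (frob[1][0] * x + frob[1][1] * y - y) % m,
        )

    vectors = list(product(range(m), repeat=2))
    kernel = [v for v in vectors if frob_minus_one(v) == (0, 0)]
    image = {frob_minus_one(v) for v in vectors}
    return len(kernel), len(vectors) // len(image)


if __name__ == "__main__":
    print(kernel_and_cokernel_orders([[1, 1], [0, 1]], 9))
    print(kernel_and_cokernel_orders([[2, 0], [0, 4]], 9))
```

The first number plays the role of $\#H^0(G_p, X)$ and the second that of $\#H^1(G_p/I_p, X^{I_p})$.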


The last preliminary topic that we need is cup products. In general,
suppose $X_1$, $X_2$, and $X_3$ are $G$-modules, and there is a $G$-module
homomorphism $\Phi : X_1 \otimes X_2 \to X_3$. The cup product is a map

\[ H^i(G, X_1) \times H^j(G, X_2) \to H^{i+j}(G, X_3). \]

We define the cup product only when $i + j = 2$, since this is the main case
we need. Let $f_1 \in H^2(G, X_1)$, so we may regard $f_1$ as (being represented
by) a map $f_1 : G \times G \to X_1$. Let $x_2 \in X_2^G = H^0(G, X_2)$. Then $f_3 = f_1 \cup x_2$
is the 2-cocycle satisfying $f_3(g_1, g_2) = \Phi(f_1(g_1, g_2) \otimes x_2)$. The cup product
of $H^0$ and $H^2$ is defined similarly. Now let $\phi_k \in H^1(G, X_k)$ for $k = 1, 2$.
Define

\[ (\phi_1 \cup \phi_2)(g_1, g_2) = \Phi(\phi_1(g_1) \otimes g_1\phi_2(g_2)). \]

It is easy to see that this defines a 2-cocycle, hence an element of $H^2(G, X_3)$.

For example, let $a, b \in \mathbf{Q}_p^\times$. Let $\phi \in H^1(G_p, \mathbf{Z}/2\mathbf{Z})$ be defined by $\phi(g) =
0$ if $g(\sqrt{a}) = \sqrt{a}$ and $\phi(g) = 1$ otherwise. Define $\psi \in H^1(G_p, \mu_2)$ by
$\psi(g) = g(\sqrt{b})/\sqrt{b}$. We may regard $\mu_2 \simeq \mathrm{Hom}(\mathbf{Z}/2\mathbf{Z}, \mu_2)$ as the dual of
$\mathbf{Z}/2\mathbf{Z}$; hence there is a map $\mathbf{Z}/2\mathbf{Z} \otimes \mu_2 \to \mu_2 \subset \bar{\mathbf{Q}}_p^\times$. Therefore $\phi \cup \psi \in
H^2(G_p, \bar{\mathbf{Q}}_p^\times)$. Fix a square root $\sqrt{b}$ and let $h(g) = (g\sqrt{b})^{\phi(g)}$. A calculation
shows that $\phi \cup \psi$ multiplied by the coboundary $h(g_1) \cdot g_1 h(g_2)/h(g_1 g_2)$
equals the cocycle $f$ defined earlier, the one corresponding to the Hilbert
symbol $(a,b)_p$. In fact, this cup product is one way to define the Hilbert
symbol; see [Se]. We now have a pairing

\[ H^1(G_p, \mathbf{Z}/2\mathbf{Z}) \times H^1(G_p, \mu_2) \to H^2(G_p, \bar{\mathbf{Q}}_p^\times) \simeq \mathbf{Q}/\mathbf{Z}. \]

The non-degeneracy of this pairing is equivalent to the non-degeneracy of
the Hilbert symbol.

Now let $p$ be odd and consider the group $H^1(G_p/I_p, \mathbf{Z}/2\mathbf{Z})$ of unramified
classes. Assume $a$ is not a square. The element $\phi$ is in this group if $\sqrt{a}$
generates an unramified extension (in fact, the unique unramified quadratic
extension) of $\mathbf{Q}_p$, which means we may assume $a$ is a $p$-adic unit. We have
$(a,b)_p = 1 \iff b$ is a norm from $\mathbf{Q}_p(\sqrt{a}) \iff b$ is a square times a $p$-adic unit
(this follows from the fact that $p$ is a uniformizer for $\mathbf{Q}_p(\sqrt{a})$) $\iff$ the
cocycle $\psi$ is unramified. Therefore, the unramified classes in $H^1(G_p, \mu_2)$
form the annihilator of the unramified classes in $H^1(G_p, \mathbf{Z}/2\mathbf{Z})$ under the
above pairing. All of this will be greatly generalized in the next section.
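
For odd $p$ the Hilbert symbol can be computed from the standard explicit formula (see, for example, Serre's "A Course in Arithmetic"): writing $a = p^\alpha u$ and $b = p^\beta v$ with $u, v$ units, one has $(a,b)_p = (-1)^{\alpha\beta(p-1)/2} \left(\frac{u}{p}\right)^{\beta} \left(\frac{v}{p}\right)^{\alpha}$. The sketch below uses that formula, which is not derived in this text, and restricts to nonzero integer arguments; the function names are hypothetical. It can be used to confirm that, for a nonsquare unit $a$, the symbol $(a,b)_p$ is 1 exactly when $v_p(b)$ is even.

```python
def split_unit(a, p):
    """Write the nonzero integer a as p^alpha * u with p not dividing u."""
    alpha = 0
    while a % p == 0:
        a //= p
        alpha += 1
    return alpha, a


def legendre(u, p):
    """Legendre symbol (u/p) for an odd prime p and p not dividing u."""
    return 1 if pow(u % p, (p - 1) // 2, p) == 1 else -1


def hilbert_symbol(a, b, p):
    """Hilbert symbol (a, b)_p for an odd prime p and nonzero integers a, b."""
    alpha, u = split_unit(a, p)
    beta, v = split_unit(b, p)
    sign = -1 if (alpha * beta * ((p - 1) // 2)) % 2 else 1
    return sign * legendre(u, p) ** beta * legendre(v, p) ** alpha


if __name__ == "__main__":
    p, a = 5, 2  # 2 is a nonsquare unit mod 5
    for b in (1, 3, 5, 10, 25, 75, 125):
        print(b, split_unit(b, p)[0], hilbert_symbol(a, b, p))
```

Each printed row shows $b$, its valuation $v_5(b)$, and $(2,b)_5$.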


§3. LOCAL TATE DUALITY
Let $p$ be prime and let $X$ be a $G_p$-module of finite cardinality $n$. Let

\[ X^* = \mathrm{Hom}_{\mathbf{Z}}(X, \mu_n), \]

where $G_p$ acts on $X^*$ by $(gx^*)(x) = g(x^*(g^{-1}x))$. Note that there is a
natural map $X \otimes X^* \to \mu_n \subset \bar{\mathbf{Q}}_p^\times$ of $G_p$-modules.


Theorem 1 (Local Tate Duality). (a) The groups $H^i(G_p, X)$ are finite
for all $i \ge 0$, and $= 0$ for $i \ge 3$.

(b) For $i = 0, 1, 2$, the cup product gives a non-degenerate pairing

\[ H^i(G_p, X) \times H^{2-i}(G_p, X^*) \to H^2(G_p, \bar{\mathbf{Q}}_p^\times) \simeq \mathbf{Q}/\mathbf{Z}. \]

(c) If $p$ does not divide the order of $X$ then the unramified classes

\[ H^1(G_p/I_p, X^{I_p}) \quad \text{and} \quad H^1(G_p/I_p, (X^*)^{I_p}) \]

are the exact annihilators of each other under the pairing
$H^1(G_p, X) \times H^1(G_p, X^*) \to \mathbf{Q}/\mathbf{Z}$.

Proof. For a proof, see [Mil].


For the archimedean prime, the groups $H^i(G_{\mathbf{R}}, X)$ are finite for all $i$. If
we use the modified group $\hat{H}^0$ in place of $H^0$, then we have $\#\hat{H}^0(G_{\mathbf{R}}, X) =
\#H^i(G_{\mathbf{R}}, X)$ for all $i \ge 0$. There is a non-degenerate pairing

\[ H^1(G_{\mathbf{R}}, X) \times H^1(G_{\mathbf{R}}, X^*) \to \mathbf{Q}/\mathbf{Z}, \]

and also

\[ \hat{H}^0(G_{\mathbf{R}}, X) \times H^2(G_{\mathbf{R}}, X^*) \to \mathbf{Q}/\mathbf{Z} \]

(and with $\hat{H}^0$ and $H^2$ reversed); note that we use the modified $\hat{H}^0$ here
also.


Another result we need evaluates Euler characteristics. 


Proposition 3. Let $p$ be prime and let $X$ be a finite $G_p$-module. Then

\[ \frac{\#H^1(G_p, X)}{\#H^0(G_p, X) \cdot \#H^2(G_p, X)} = \frac{\#H^1(G_p, X)}{\#H^0(G_p, X) \cdot \#H^0(G_p, X^*)} = p^{v_p(\#X)}. \]


Proof. The first equality follows from Theorem 1. For a proof of the
proposition, see [Mil].


By using Theorem 1 and Proposition 3, we can evaluate $\#H^1(G_p, X)$
and $\#H^2(G_p, X)$ in terms of $\#H^0(G_p, X)$ and $\#H^0(G_p, X^*)$. These are
much easier to calculate in most cases.
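
As a quick sanity check of how this evaluation goes, consider the simplest case $X = \mathbf{Z}/2\mathbf{Z}$ with trivial action, so that $X^* = \mu_2$ and both $H^0$ groups have order 2. The following worked computation is an illustration added here, relying on the description of $H^1(G_K, \mathbf{Z}/2\mathbf{Z})$ in terms of quadratic extensions given in §1.

```latex
\[
\#H^1(G_p, \mathbf{Z}/2\mathbf{Z})
  = \#H^0(G_p, \mathbf{Z}/2\mathbf{Z}) \cdot \#H^0(G_p, \mu_2) \cdot p^{v_p(2)}
  = 2 \cdot 2 \cdot p^{v_p(2)}
  = \begin{cases} 4 & \text{if $p$ is odd,} \\ 8 & \text{if } p = 2, \end{cases}
\qquad
\#H^2(G_p, \mathbf{Z}/2\mathbf{Z}) = \#H^0(G_p, \mu_2) = 2.
\]
```

This agrees with the count of homomorphisms $G_p \to \mathbf{Z}/2\mathbf{Z}$: the trivial one together with one for each quadratic extension of $\mathbf{Q}_p$, of which there are 3 when $p$ is odd and 7 when $p = 2$.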


§4. EXTENSIONS AND DEFORMATIONS 


The main reason that Galois cohomology arises in Wiles’ work is that 
certain cohomology groups can be used to classify deformations of Galois 
representations. In order to explain this, we need a few concepts. 

Suppose $G$ is a group acting on an abelian group $M$, and assume in
addition that $M$ is a free module of rank $n$ over a ring $R$ (commutative
with 1), and the action of $G$ commutes with the action of $R$. The action
of $G$ is then given by a homomorphism

\[ \rho : G \to GL_n(R). \]

This yields an action of $G$ on $M_n(R)$, the ring of $n \times n$ matrices, via
$x \mapsto \rho(g) x \rho(g)^{-1}$. Let $\mathrm{Ad}\,\rho$ denote $M_n(R)$ (or $\mathrm{End}_R(M)$) with this action.
We also will need the submodule $\mathrm{Ad}^0\rho$ consisting of matrices with trace 0.
An extension of $M$ by $M$ will mean a short exact sequence

\[ 0 \to M \xrightarrow{\ \alpha\ } E \xrightarrow{\ \beta\ } M \to 0, \]

where $E$ is an $R[G]$-module and $\alpha$ and $\beta$ are $R[G]$-homomorphisms. The
equivalence of two extensions is given by a commutative diagram

\[ \begin{array}{ccccccccc}
0 & \to & M & \to & E_1 & \to & M & \to & 0 \\
  &     & \downarrow{=} & & \downarrow{\gamma} & & \downarrow{=} & & \\
0 & \to & M & \to & E_2 & \to & M & \to & 0,
\end{array} \]

where $\gamma$ is an $R[G]$-isomorphism. The set of equivalence classes of such
extensions is denoted $\mathrm{Ext}^1(M, M)$.

Let $R[\epsilon]$ denote the ring $R[T]/(T^2)$ (so $\epsilon^2 = 0$). An infinitesimal
deformation of $\rho$ is an extension of $\rho$ to

\[ \rho' : G \to GL_n(R[\epsilon]) \]

such that $\rho'$ maps to $\rho$ under the map $\epsilon \mapsto 0$. Two such infinitesimal
deformations $\rho'$ and $\rho''$ are equivalent if there is a matrix $A \equiv I \bmod \epsilon$
such that $A\rho' A^{-1} = \rho''$. The idea behind this is that we want to fit $\rho$ into
a family of representations. Suppose, for example, that $R$ is a local ring
with maximal ideal $\mathcal{M}$, and that we can extend $\rho$ to $\tilde\rho : G \to GL_n(R[T])$
(or $R[[T]]$ if $R$ is complete). Then we can evaluate $T$ at anything in the
maximal ideal $\mathcal{M}$ and get a representation congruent to $\rho$ mod $\mathcal{M}$. The
infinitesimal deformations are the first steps in the direction of constructing
such families.


Proposition 4. The following sets are in one-one correspondence.

(a) $H^1(G, \mathrm{Ad}\,\rho)$.

(b) $\mathrm{Ext}^1(M, M)$.

(c) Equivalence classes of infinitesimal deformations of $\rho$.


Proof. Consider an extension $0 \to M \xrightarrow{\alpha} E \xrightarrow{\beta} M \to 0$. Since $M$ is
free over $R$, there is an $R$-module homomorphism $\phi : M \to E$ such that
$\beta \circ \phi = \mathrm{id}_M$. Let $g \in G$ and $m \in M$. Since $\beta$ is an $R[G]$-homomorphism,
$g\phi(g^{-1}m) - \phi(m)$ is in $\alpha(M) = \mathrm{Ker}\,\beta$. Let $T_g : M \to M$ be defined by

\[ T_g(m) = \alpha^{-1}\bigl(g\phi(g^{-1}m) - \phi(m)\bigr). \]

It is easy to check that $T_{g_1 g_2} = T_{g_1} + g_1 T_{g_2}$, where the action of $G$ is the one
on $\mathrm{Ad}\,\rho$. Therefore $g \mapsto T_g$ gives an element of $H^1(G, \mathrm{Ad}\,\rho)$. If we have two
equivalent extensions and $\phi_1$ and $\phi_2$ are the corresponding maps, and $T_1$
and $T_2$ are the corresponding cocycles, then $(T_2)_g - (T_1)_g = g\psi - \psi$, where
$\psi = \alpha_1^{-1}\gamma^{-1}(\phi_2 - \gamma\phi_1) : M \to M$. Therefore $T_2 - T_1$ is a coboundary for
$\mathrm{Ad}\,\rho$, hence $T_1$ and $T_2$ represent the same class in $H^1(G, \mathrm{Ad}\,\rho)$. Therefore
we have a well-defined map $\mathrm{Ext}^1(M, M) \to H^1(G, \mathrm{Ad}\,\rho)$.

Note that the trivial extension $E = M \oplus M$ (as $R[G]$-modules) yields
the trivial cohomology class.

We remark that this method of obtaining cocycles is fairly standard;
namely, take an element, such as $\phi$, in a bigger set, in this case $\mathrm{Hom}(M, E)$,
and form $g\phi - \phi$. Something of this form will automatically satisfy the
cocycle condition, but of course we also want $g\phi - \phi$ to be in the original
set. When $\phi$ itself is in the original set, in this case $\mathrm{Ad}\,\rho$, the cocycle is a
coboundary.

Now suppose we have two extensions $E_1$ and $E_2$ and corresponding
cohomology classes $T_1$ and $T_2$, and suppose these classes are equal. Then
there exists an $R$-map $\psi : M \to M$ such that $(T_2)_g - (T_1)_g = g\psi - \psi$. Let
$e_1 \in E_1$. We can uniquely write $e_1 = \alpha_1(m) + \phi_1(m')$ with $m, m' \in M$.
Define $\gamma(e_1) = \alpha_2(m) + \phi_2(m') - \alpha_2(\psi(m'))$. A calculation shows that $\gamma :
E_1 \to E_2$ is an $R[G]$-homomorphism that makes the appropriate diagram
commute (and is therefore an isomorphism, by the Snake Lemma); hence
the extensions are equivalent. We have proved that the map $\mathrm{Ext}^1(M, M) \to
H^1(G, \mathrm{Ad}\,\rho)$ is an injection.

Finally, let $g \mapsto C(g) \in \mathrm{Ad}\,\rho$ be a cocycle. Let $E = M \otimes_R R[\epsilon] = \epsilon M \oplus M$.
We regard $\rho(g)$ as an element of $GL_n(R[\epsilon])$ via the natural containment
$GL_n(R) \subset GL_n(R[\epsilon])$. The matrix $I + \epsilon C(g)$ is also in $GL_n(R[\epsilon])$, so we
define

\[ \rho'(g) = (I + \epsilon C(g))\rho(g). \]

This is easily seen to be a homomorphism, and gives an action of $G$ on $E$.
We have the short exact sequence

\[ 0 \to M \xrightarrow{\ \epsilon\ } E \to M \to 0. \]

Let $\phi : M \to E = \epsilon M \oplus M$ be the map to the second summand. Then the
above recipe gives

\[ T_g(m) = \epsilon^{-1}\bigl((I + \epsilon C(g))\rho(g)\,\phi(\rho(g)^{-1}m) - \phi(m)\bigr) = C(g)(m). \]

Therefore this extension yields the cocycle $C$, so the map $\mathrm{Ext}^1(M, M) \to
H^1(G, \mathrm{Ad}\,\rho)$ is surjective.

The above shows that a cocycle yields an infinitesimal deformation.
Conversely, if $\rho' : G \to GL_n(R[\epsilon])$ extends $\rho$, define $C(g)$ by $I + \epsilon C(g) =
\rho'(g)\rho(g)^{-1}$. An easy calculation shows that $C$ is a cocycle. The identity

\[ (I + \epsilon A)(I + \epsilon C)\rho(I - \epsilon A) = \bigl(I + \epsilon(A - \rho A \rho^{-1} + C)\bigr)\rho \]

shows that equivalence of deformations corresponds to equivalence of
cohomology classes. Note that the trivial cohomology class corresponds to the
trivial deformation $\rho' = \rho$. This completes the proof. $\square$


One of the themes in Wiles' work is to consider deformations with various
restrictions imposed. By the above, this corresponds to considering
cohomology classes lying in certain subsets of $H^1(G, \mathrm{Ad}\,\rho)$. For the
moment, we consider two such examples.


Example 1. Suppose we want to consider deformations where the determinant
remains unchanged. Note that $\det((I + \epsilon C)\rho) = (1 + \epsilon\,\mathrm{Tr}(C))\det\rho$.
Keeping the determinant unchanged is equivalent to having $C \in \mathrm{Ad}^0\rho$.
Since $\mathrm{Ad}\,\rho = \mathrm{Ad}^0\rho \oplus R$, where $R$ represents the scalar matrices with trivial
action of $G$, we have $H^1(G, \mathrm{Ad}\,\rho) = H^1(G, \mathrm{Ad}^0\rho) \oplus H^1(G, R)$. From
the above, $H^1(G, \mathrm{Ad}^0\rho)$ gives the classes of infinitesimal deformations with
fixed determinant.
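
Both matrix identities used so far, the conjugation identity from the proof of Proposition 4 and the determinant rule of Example 1, are statements about $2 \times 2$ (or $n \times n$) matrices over the dual numbers $R[\epsilon]$. The sketch below checks them for $n = 2$ over $R = \mathbf{Z}$, representing $M_0 + \epsilon M_1$ as the pair `(M0, M1)`. The representation, the sample matrices, and the function names are assumptions made for illustration; $\rho$ is chosen with determinant 1 so that its inverse is integral.

```python
def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def add(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]


def neg(A):
    return [[-A[i][j] for j in range(2)] for i in range(2)]


I2 = [[1, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]


def dual_mul(X, Y):
    """(X0 + eps X1)(Y0 + eps Y1) = X0 Y0 + eps (X0 Y1 + X1 Y0), since eps^2 = 0."""
    return (mul(X[0], Y[0]), add(mul(X[0], Y[1]), mul(X[1], Y[0])))


def dual_det(X):
    """Determinant of a 2x2 matrix over the dual numbers, as (constant part, eps part)."""
    (a, b), (c, d) = X[0]
    (a1, b1), (c1, d1) = X[1]
    return (a * d - b * c, a * d1 + a1 * d - b * c1 - b1 * c)


def conjugation_identity_holds(rho, rho_inv, A, C):
    """(I + eps A)(I + eps C) rho (I - eps A) == (I + eps(A - rho A rho^-1 + C)) rho."""
    lhs = dual_mul(dual_mul(dual_mul((I2, A), (I2, C)), (rho, ZERO)), (I2, neg(A)))
    twisted = add(add(A, neg(mul(mul(rho, A), rho_inv))), C)
    rhs = dual_mul((I2, twisted), (rho, ZERO))
    return lhs == rhs


def determinant_rule_holds(rho, C):
    """det((I + eps C) rho) == (1 + eps Tr C) det rho."""
    det0, det1 = dual_det(dual_mul((I2, C), (rho, ZERO)))
    det_rho = rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]
    return det0 == det_rho and det1 == (C[0][0] + C[1][1]) * det_rho


if __name__ == "__main__":
    rho, rho_inv = [[2, 1], [1, 1]], [[1, -1], [-1, 2]]
    A, C = [[1, 2], [3, 4]], [[0, 5], [-1, 7]]
    print(conjugation_identity_holds(rho, rho_inv, A, C), determinant_rule_holds(rho, C))
```

In particular the $\epsilon$-part of the determinant vanishes exactly when $\mathrm{Tr}(C) = 0$, which is the condition $C \in \mathrm{Ad}^0\rho$.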
Example 2. Let $p$ be prime and consider a cohomology class

\[ C \in H^1(G_p/I_p, (\mathrm{Ad}\,\rho)^{I_p}), \]

which is the kernel of the restriction map $H^1(G_p, \mathrm{Ad}\,\rho) \to H^1(I_p, \mathrm{Ad}\,\rho)$.
Let $\rho'$ be the corresponding deformation. Then $\rho'$ restricted to $I_p$ is
(equivalent to) the trivial deformation: $\rho'|_{I_p} = \rho|_{I_p}$. Therefore $\rho'$ is unramified
at $p$ if and only if $\rho$ is unramified at $p$ (i.e., $\rho|_{I_p}$ is trivial). Moreover, if $\rho$
is ramified, all the ramification of the deformation $\rho'$ comes from that of $\rho$.
We will often require certain cohomology classes to be unramified in order
to control the ramification of the corresponding deformations of $\rho$.


§5. GENERALIZED SELMER GROUPS 


Let $X$ be a $G_{\mathbf{Q}}$-module. Eventually, $X$ will be $\mathrm{Ad}^0\rho$, but for the moment
we do not need to make this restriction. As indicated above, we want to
study cohomology classes in $H^1(G_{\mathbf{Q}}, X)$ with various local restrictions. For
each place $\ell$ of $\mathbf{Q}$, including the archimedean one, we may regard the group
$G_\ell$ as a subgroup of $G_{\mathbf{Q}}$. There are many ways to do this, but all the results
we obtain will be independent of these choices. We have the restriction
maps

\[ \mathrm{res}_\ell : H^1(G_{\mathbf{Q}}, X) \to H^1(G_\ell, X). \]

Let $\mathcal{L} = \{L_\ell\}$ be a family of subgroups $L_\ell \subseteq H^1(G_\ell, X)$ as $\ell$ runs through
all places of $\mathbf{Q}$, with $L_\ell = H^1(G_\ell/I_\ell, X^{I_\ell})$ for all but finitely many $\ell$. Such a
family will be called a collection of local conditions. Define the generalized
Selmer group

\[ H^1_{\mathcal{L}}(\mathbf{Q}, X) = \{x \in H^1(G_{\mathbf{Q}}, X) \mid \mathrm{res}_\ell(x) \in L_\ell \text{ for all } \ell\}. \]

Let $\mathcal{L}^* = \{L_\ell^\perp\}$, where $L_\ell^\perp$ is the annihilator of $L_\ell$ under the Tate pairing.
By Theorem 1, $L_\ell^\perp = H^1(G_\ell/I_\ell, (X^*)^{I_\ell})$ for all but finitely many $\ell$. The
following result is crucial in Wiles' proof. It was inspired by work of Ralph
Greenberg [Gr].


Theorem 2. The group $H^1_{\mathcal{L}}(\mathbf{Q}, X)$ is finite, and

\[ \frac{\#H^1_{\mathcal{L}}(\mathbf{Q}, X)}{\#H^1_{\mathcal{L}^*}(\mathbf{Q}, X^*)} = \frac{\#H^0(G_{\mathbf{Q}}, X)}{\#H^0(G_{\mathbf{Q}}, X^*)} \prod_{\ell} \frac{\#L_\ell}{\#H^0(G_\ell, X)}. \]

Note that $\#H^0(G_\ell, X) = \#H^1(G_\ell/I_\ell, X^{I_\ell})$ by Lemma 1, so almost all
factors in the product are 1. The formulation of the theorem is that of
[DDT], which differs slightly from that of [Wi]. An easy exercise, using
Theorem 1 and Proposition 3, shows that the two versions are equivalent.
We sketch the proof of the theorem at the end of the paper.
In the applications, $\mathcal{L}$ is chosen so that $H^1_{\mathcal{L}^*} = 0$. Since the terms on the
right are fairly easy to work with, we obtain information about the group
$H^1_{\mathcal{L}}$, which for appropriate $X$ describes deformations of representations with
certain local conditions.

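To show how the formula may be used, we now give an application in
a fairly concrete setting. The techniques are much in the spirit of those
used by Wiles. Let $X = \mathbf{Z}/p^n\mathbf{Z}$ (with trivial Galois action), where $p$ is
an odd prime. Let $S$ be a finite set of primes containing $p$ and $\infty$. For
$\ell \in S$, let $L_\ell = H^1(G_\ell, \mathbf{Z}/p^n\mathbf{Z})$. For $\ell \notin S$, let $L_\ell = H^1(G_\ell/I_\ell, \mathbf{Z}/p^n\mathbf{Z})$.
Then $L_\ell^\perp = 0$ for $\ell \in S$ and $L_\ell^\perp = H^1(G_\ell/I_\ell, \mu_{p^n})$ for $\ell \notin S$. Consider
$H^1_{\mathcal{L}^*}(\mathbf{Q}, \mu_{p^n})$.

From above, we know that every element of $H^1(G_{\mathbf{Q}}, \mu_{p^n})$ is represented
by a cocycle of the form $g \mapsto g\alpha/\alpha$, where $\alpha^{p^n} = a \in \mathbf{Q}^\times$. To be in $H^1_{\mathcal{L}^*}$,
it must be unramified everywhere. Since

\[ H^1(I_\ell, \mu_{p^n}) = H^1(G_{\mathbf{Q}_\ell^{\mathrm{unr}}}, \mu_{p^n}) \simeq (\mathbf{Q}_\ell^{\mathrm{unr}})^\times / ((\mathbf{Q}_\ell^{\mathrm{unr}})^\times)^{p^n}, \]

where $\mathbf{Q}_\ell^{\mathrm{unr}}$ is the maximal unramified extension of $\mathbf{Q}_\ell$, this implies that
$v_\ell(a) \equiv 0 \bmod p^n$ for all $\ell$. Therefore $a$ is a $p^n$-th power in $\mathbf{Q}$ (we can ignore
$\pm 1$ since $p$ is odd) and the cocycle represents the trivial cohomology class.
It follows that $H^1_{\mathcal{L}^*}(\mathbf{Q}, \mu_{p^n}) = 0$.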

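We now evaluate the right side of the formula. First,

\[ \#H^0(G_{\mathbf{Q}}, \mathbf{Z}/p^n\mathbf{Z}) = \#\mathbf{Z}/p^n\mathbf{Z} = p^n. \]

Since we chose $p$ to be odd, $H^0(G_{\mathbf{Q}}, \mu_{p^n}) = 0$. In the product, the terms
for $\ell \notin S$ are all 1. When $\ell \ne \infty$ is in $S$, the factor is

\[ \frac{\#H^1(G_\ell, \mathbf{Z}/p^n\mathbf{Z})}{\#H^0(G_\ell, \mathbf{Z}/p^n\mathbf{Z})} = \#H^0(G_\ell, \mu_{p^n}) \cdot \ell^{v_\ell(p^n)} \]

by Proposition 3. The number of $p^n$th roots of unity in $\mathbf{Q}_\ell$ is $(\ell - 1, p^n)$, so
this is the order of $H^0(G_\ell, \mu_{p^n})$. Since $\#\mathrm{Hom}(G_{\mathbf{R}}, \mathbf{Z}/p^n\mathbf{Z}) = 1$, the factor
for $\ell = \infty$ is $1/p^n$. Putting everything together, we find

\[ \#H^1_{\mathcal{L}}(\mathbf{Q}, \mathbf{Z}/p^n\mathbf{Z}) = p^n \prod_{\ell \in S \setminus \infty} (\ell - 1, p^n). \]

Note that $H^1(G_{\mathbf{Q}}, \mathbf{Z}/p^n\mathbf{Z}) = \mathrm{Hom}(G_{\mathbf{Q}}, \mathbf{Z}/p^n\mathbf{Z})$ classifies cyclic extensions
of degree dividing $p^n$, and $H^1_{\mathcal{L}}(\mathbf{Q}, \mathbf{Z}/p^n\mathbf{Z})$ gives those extensions that are
unramified outside $S$.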

We already have a good supply of such extensions coming from subfields
of cyclotomic fields. For each finite prime $\ell \in S$, there is a cyclic extension
of degree $(\ell - 1, p^n)$ contained in the $\ell$th cyclotomic field. There is also
a cyclic extension of degree $p^n$ contained in the $p^{n+1}$st cyclotomic field.
These extensions are disjoint, so we obtain an abelian extension of exponent
$p^n$ and degree $p^n \prod_{\ell \in S \setminus \infty} (\ell - 1, p^n)$. The Galois group of this extension
has this many homomorphisms into $\mathbf{Z}/p^n\mathbf{Z}$, so all homomorphisms of $G_{\mathbf{Q}}$
into $\mathbf{Z}/p^n\mathbf{Z}$ unramified outside $S$ are obtained from subfields of cyclotomic
fields. By enlarging $S$ arbitrarily, we find that every cyclic extension of $\mathbf{Q}$
of degree dividing $p^n$ is contained in a cyclotomic field. The same analysis
may be done for powers of 2 with the same result. Since every finite abelian
group is a product of cyclic groups of prime power order, we obtain the
Kronecker-Weber theorem that every abelian extension of $\mathbf{Q}$ is contained
in a cyclotomic field. (Of course, this proof is by no means elementary,
since the full power of class field theory is used in the proof of Theorem 2.)
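
The comparison at the heart of this argument is a count, and it can be reproduced numerically. The sketch below evaluates the Selmer group order $p^n \prod_{\ell}(\ell - 1, p^n)$ and compares it with the number of homomorphisms from $(\mathbf{Z}/N\mathbf{Z})^\times$ to $\mathbf{Z}/p^n\mathbf{Z}$, where $N = p^{n+1} \prod_{\ell \ne p} \ell$. It uses the standard fact that a finite abelian group has as many homomorphisms into $\mathbf{Z}/q\mathbf{Z}$ as it has elements killed by $q$. The function names are hypothetical, and the list of primes is assumed to contain $p$.

```python
from math import gcd


def selmer_order(p, n, finite_primes_in_S):
    """p^n times the product of gcd(ell - 1, p^n) over the finite primes ell in S."""
    q = p ** n
    order = q
    for ell in finite_primes_in_S:
        order *= gcd(ell - 1, q)
    return order


def cyclotomic_hom_count(p, n, finite_primes_in_S):
    """Number of homomorphisms (Z/N)^x -> Z/p^n, counted as solutions of x^(p^n) = 1 mod N."""
    q = p ** n
    N = p ** (n + 1)
    for ell in finite_primes_in_S:
        if ell != p:
            N *= ell
    return sum(1 for x in range(1, N) if gcd(x, N) == 1 and pow(x, q, N) == 1)


if __name__ == "__main__":
    for p, n, primes in ((3, 1, [3, 5, 7]), (3, 2, [3, 7, 19]), (5, 1, [5, 11, 31])):
        print(p, n, primes, selmer_order(p, n, primes), cyclotomic_hom_count(p, n, primes))
```

In each row the last two numbers agree, which is the numerical content of the statement that the cyclotomic subfields already account for every class in the Selmer group.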
As in the proof of the Kronecker-Weber theorem just given, it will sometimes
be necessary to enlarge the set of primes at which ramification is
allowed. The following estimates how much the Selmer group increases.


Proposition 5. Let $p$ be prime and suppose $\#X$ is a power of $p$. Let
$\mathcal{L} = \{L_\ell\}$ be a collection of local conditions and let $q \ne p$ be a prime for
which $L_q = H^1(G_q/I_q, X^{I_q})$. Define a new collection $\mathcal{L}' = \{L'_\ell\}$ of local
conditions by $L'_\ell = L_\ell$ if $\ell \ne q$ and $L'_q = H^1(G_q, X)$. Then

\[ \frac{\#H^1_{\mathcal{L}'}(\mathbf{Q}, X)}{\#H^1_{\mathcal{L}}(\mathbf{Q}, X)} \le \#H^0(G_q, X^*). \]

Proof. Since $(L'_q)^\perp = 0$, the conditions defining $H^1_{\mathcal{L}'^*}$ are more restrictive
than those defining $H^1_{\mathcal{L}^*}$, so $H^1_{\mathcal{L}'^*}$ has order less than or equal to the
order of $H^1_{\mathcal{L}^*}$. When $\mathcal{L}$ is changed to $\mathcal{L}'$ in Theorem 2, all factors on the
right remain the same except the one for $q$, which changes from 1 to
$\#H^1(G_q, X)/\#H^0(G_q, X)$. By Proposition 3, this equals $\#H^0(G_q, X^*)$,
since $q \nmid \#X$. The result follows easily. $\square$


§6. LOCAL CONDITIONS 


From now on, fix a finite set $\Sigma$ of primes (including $\infty$, though this
will not be important). Let $p$ be an odd prime and assume $R$ is a finite
ring of cardinality a power of $p$. We will work with $X = \mathrm{Ad}^0\rho$, where
$\rho : G_{\mathbf{Q}} \to GL_2(R)$ is a 2-dimensional representation. We also assume $\rho$ is
an odd representation. For our present purposes, we take this to mean that
if $c$ is (any choice of) complex conjugation, then the matrix $\rho(c)$ is similar
to $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$.

Define a collection of local conditions as follows:

\[ \begin{aligned}
L_\ell &= H^1(G_\ell/I_\ell, (\mathrm{Ad}^0\rho)^{I_\ell}) && \text{for } \ell \notin \Sigma,\ \ell \ne p, \\
L_\ell &= H^1(G_\ell, \mathrm{Ad}^0\rho) && \text{for } \ell \in \Sigma,\ \ell \ne p, \\
L_p &\ \text{will be specified later.}
\end{aligned} \]

In other words, if we think in terms of infinitesimal deformations, we allow
as little ramification as possible at the primes $\ell \ne p$ outside $\Sigma$, the
ramification at those places being due to ramification in $\rho$. At the primes $\ell \ne p$ in $\Sigma$
we allow arbitrary ramification. At $p$ we want to control what happens a
little more carefully, depending on properties of $\rho$.

In the formula of Theorem 2, we need to evaluate, or at least estimate,
the factors $\#L_\ell/\#H^0(G_\ell, \mathrm{Ad}^0\rho)$ corresponding to the various primes.

• The factors for the primes $\ell \notin \Sigma$ with $\ell \ne p$ are all 1 by Lemma 1.

• The factor for the infinite prime is easy. Since $G_{\mathbf{R}}$ has order 2
and $\mathrm{Ad}^0\rho$ has odd order, $H^1(G_{\mathbf{R}}, \mathrm{Ad}^0\rho) = 0$. Therefore $L_\infty$ is a
subgroup of the trivial group, hence trivial. We may assume that
$\rho(c) = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$. Since $\rho(c)A\rho(c)^{-1} = A$ is equivalent to $A$ being
diagonal, we see that $H^0(G_{\mathbf{R}}, \mathrm{Ad}^0\rho)$ has order $\#R$. Therefore the
factor for $\infty$ is $1/\#R$.

• Let $\ell \in \Sigma$, $\ell \ne p, \infty$. Then, as in the proof of Proposition 5, we
have

\[ \frac{\#H^1(G_\ell, \mathrm{Ad}^0\rho)}{\#H^0(G_\ell, \mathrm{Ad}^0\rho)} = \#H^0(G_\ell, (\mathrm{Ad}^0\rho)^*). \]
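
The count for the infinite prime can be confirmed by enumeration. The sketch below assumes $R = \mathbf{Z}/m\mathbf{Z}$ with $m$ odd (an illustrative choice of finite ring) and counts the trace-zero matrices fixed under conjugation by $\mathrm{diag}(1,-1)$; the function name is hypothetical.

```python
from itertools import product


def fixed_trace_zero_count(m):
    """Number of trace-zero 2x2 matrices over Z/m commuting with diag(1, -1)."""
    count = 0
    for a, b, c in product(range(m), repeat=3):
        # A = [[a, b], [c, -a]]; conjugation by diag(1, -1) gives [[a, -b], [-c, -a]],
        # so A is fixed exactly when 2b = 0 and 2c = 0 in Z/m.
        if (2 * b) % m == 0 and (2 * c) % m == 0:
            count += 1
    return count


if __name__ == "__main__":
    for m in (3, 5, 9, 25):
        print(m, fixed_trace_zero_count(m))
```

For odd $m$ the fixed matrices are exactly $\mathrm{diag}(a, -a)$, so the count equals $\#R = m$.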


§7. CONDITIONS AT p 
Ordinary representations. Suppose p|g, has the form (for some choice 


of basis) es th ): where w, and we are unramified characters (with 
2 


values in R*), and ¢€ is now the cyclotomic character (not the infinitesimal 
element from above) giving the action of G, on the p-power roots of unity. 
Let W® be the additive subgroup of Ad° p given by matrices of the form 


(0 9) 


Lemma 2. G, acts on W® by multiplication by ye/we. 


Proof. 
a4 * ) ( ) Sh + a = ¢ aed) 
0 we 0 0 0 Yo 0 0 ; 
Lemma 3. $\#H^0(G_p, (W^0)^*) = \#R/\bigl((\psi_1/\psi_2)(\mathrm{Frob}_p) - 1\bigr)R$.


Proof. An element of $(W^0)^*$ is a group homomorphism $\phi : R \to \mu_{p^n}$ (for
some sufficiently large $n$), and $\phi$ is fixed by $G_p$ if and only if $\phi(gr) = g\phi(r)$
for all $g \in G_p$ and $r \in R$. By Lemma 2, this means $\phi\bigl(\tfrac{\psi_1\varepsilon}{\psi_2} r\bigr) = \varepsilon\phi(r)$.
Note that $\varepsilon$ takes values in the image of $\mathbf{Z}_p$ in $R$, which is the same as the
image of $\mathbf{Z}$ in $R$. Therefore we can regard $\varepsilon$ as an integer that is also a
unit in $R$, and consequently obtain $\phi\bigl(\tfrac{\psi_1}{\psi_2} r\bigr) = \phi(r)$. Since $\psi_1$ and $\psi_2$ are
unramified, it suffices to check this for $g = \mathrm{Frob}_p$, so we let $\alpha = \tfrac{\psi_1}{\psi_2}(\mathrm{Frob}_p)$.
We need $\phi$ to satisfy $\phi((\alpha - 1)r) = 0$ for all $r$. This says that $\phi$ is a
group homomorphism from $R/(\alpha - 1)R$ to $\mu_{p^n}$. The number of such
homomorphisms is $\#R/(\alpha - 1)R$. □
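The count in Lemma 3 can be illustrated by brute force in the simplest case. This sketch assumes $R = \mathbf{Z}/N$ with $N = p^n$ and identifies $\mu_{p^n}$ with $\mathbf{Z}/N$ written additively; the function names are illustrative and do not come from the text.

```python
from math import gcd

def count_fixed_homs(N, alpha):
    """Count homomorphisms phi: Z/N -> Z/N with phi(alpha*r) = phi(r) for all r.

    Every homomorphism Z/N -> Z/N is r -> k*r for a unique k mod N.
    """
    count = 0
    for k in range(N):
        if all((k * alpha * r - k * r) % N == 0 for r in range(N)):
            count += 1
    return count

def order_of_quotient(N, alpha):
    """Order of (Z/N) / (alpha - 1)(Z/N), which equals gcd(alpha - 1, N)."""
    return gcd(alpha - 1, N)
```

For example, with $N = 27$ and $\alpha = 10$ both functions return 9, and with $\alpha = 1$ both return 27, matching the case $\psi_1 = \psi_2$ used later.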


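We now look at two choices for $L_p$.


Choice 1. $L_p = \mathrm{Ker}\bigl(H^1(G_p, \mathrm{Ad}^0\rho) \to H^1(I_p, \mathrm{Ad}^0\rho/W^0)\bigr)$

In terms of infinitesimal deformations $\rho'$, this requires $\rho'|_{I_p}$ always to be
equivalent to the form $\begin{pmatrix} \varepsilon & * \\ 0 & 1 \end{pmatrix}$. This case will be used, for example, in
the case of an elliptic curve with good ordinary reduction at $p$.
Consider the diagram

\[
\begin{array}{ccccc}
 & & H^1(G_p, \mathrm{Ad}^0\rho) & & \\
 & & \downarrow{\scriptstyle u} & & \\
0 \to H^1\bigl(G_p/I_p, (\mathrm{Ad}^0\rho/W^0)^{I_p}\bigr) & \to & H^1(G_p, \mathrm{Ad}^0\rho/W^0) & \xrightarrow{\ \mathrm{res}\ } & H^1(I_p, \mathrm{Ad}^0\rho/W^0)^{G_p/I_p}.
\end{array}
\]

Then $L_p = \mathrm{Ker}(\mathrm{res} \circ u)$ and $H^1(G_p, \mathrm{Ad}^0\rho)/L_p \simeq \mathrm{Im}(\mathrm{res} \circ u)$.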

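From the exact sequence,

\[
\#\mathrm{Im}(\mathrm{res} \circ u) \ge \#\mathrm{Im}\,u \,/\, \#H^1\bigl(G_p/I_p, (\mathrm{Ad}^0\rho/W^0)^{I_p}\bigr)
 = \#\mathrm{Im}\,u \,/\, \#H^0(G_p, \mathrm{Ad}^0\rho/W^0),
\]

the last equality following from Lemma 1. The exact sequence (with
$H^r(X) = H^r(G_p, X)$)

\[
0 \to H^0(W^0) \to H^0(\mathrm{Ad}^0\rho) \to H^0(\mathrm{Ad}^0\rho/W^0)
 \to H^1(W^0) \to H^1(\mathrm{Ad}^0\rho) \xrightarrow{\ u\ } \mathrm{Im}\,u \to 0
\]

yields $\#\mathrm{Im}\,u$ as the alternating product of the orders of the other terms,
and we obtain

\[
\begin{aligned}
\frac{\#L_p}{\#H^0(G_p, \mathrm{Ad}^0\rho)}
 &= \frac{\#H^1(G_p, \mathrm{Ad}^0\rho)}{\#H^0(G_p, \mathrm{Ad}^0\rho)\,\#\mathrm{Im}(\mathrm{res} \circ u)} \\
 &\le \frac{\#H^1(G_p, \mathrm{Ad}^0\rho)\,\#H^0(G_p, \mathrm{Ad}^0\rho/W^0)}{\#H^0(G_p, \mathrm{Ad}^0\rho)\,\#\mathrm{Im}\,u} \\
 &= \frac{\#H^1(G_p, W^0)}{\#H^0(G_p, W^0)} \\
 &= \#R \cdot \#H^0(G_p, (W^0)^*).
\end{aligned}
\]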


The last equality follows from Proposition 3. Combining this with Lemma
3, we obtain

\[ \frac{\#L_p}{\#H^0(G_p, \mathrm{Ad}^0\rho)} \le \#R \cdot \#\Bigl(R/\bigl((\psi_1/\psi_2)(\mathrm{Frob}_p) - 1\bigr)R\Bigr). \]


Choice 2. $L_p = \mathrm{Ker}\bigl(H^1(G_p, \mathrm{Ad}^0\rho) \to H^1(G_p, \mathrm{Ad}^0\rho/W^0)\bigr)$

This is used when working with an elliptic curve that has bad multiplicative
reduction at $p$. It is similar to the previous case, except that it specifies
what happens on all of $G_p$. Actually, in this case (“ordinary but not flat”
[DDT], or “strict” [Wi]) we could use the same $L_p$ as before, by a result of
Diamond [Wi, Proposition 1.1], but the present choice is more convenient
for our calculations. By the calculations just completed, but with the new
choice of $L_p$, we have $H^1(G_p, \mathrm{Ad}^0\rho)/L_p \simeq \mathrm{Im}\,u$ and

\[ \frac{\#L_p}{\#H^0(G_p, \mathrm{Ad}^0\rho)} = \frac{\#R \cdot \#H^0(G_p, (W^0)^*)}{\#H^0(G_p, \mathrm{Ad}^0\rho/W^0)}. \]


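In the case where this will be applied, we will have
\[ \psi_1 = \psi_2, \]
so $\#H^0(G_p, (W^0)^*) = \#R$ by Lemma 3. Also, we will have a matrix
\[ \rho(g) = \begin{pmatrix} \psi_1\varepsilon & y \\ 0 & \psi_2 \end{pmatrix} \quad \text{with } y \in R^\times \]
in the image of $\rho|_{G_p}$. Since
\[
\begin{pmatrix} \psi_1\varepsilon & y \\ 0 & \psi_2 \end{pmatrix}
\begin{pmatrix} a & b \\ c & -a \end{pmatrix}
\begin{pmatrix} \psi_1\varepsilon & y \\ 0 & \psi_2 \end{pmatrix}^{-1}
= \begin{pmatrix} a + \frac{y}{\psi_1\varepsilon} c & * \\ \frac{\psi_2}{\psi_1\varepsilon} c & -a - \frac{y}{\psi_1\varepsilon} c \end{pmatrix},
\]
it follows that an element of $\mathrm{Ad}^0\rho/W^0$ fixed by $G_p$ is represented by a
diagonal matrix. Therefore $\#H^0(G_p, \mathrm{Ad}^0\rho/W^0) = \#R$. Putting things
together, we obtain

\[ \frac{\#L_p}{\#H^0(G_p, \mathrm{Ad}^0\rho)} = \#R. \]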
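The conjugation used for Choice 2 can be verified on sample values. This is a sketch under stated assumptions: rational numbers stand in for the ring $R$, and `s`, `y`, `t` are hypothetical stand-ins for $\psi_1\varepsilon$, $y$ and $\psi_2$. It checks that the diagonal of the conjugate equals that of the original matrix exactly when $c = 0$ (given that $y$ is a unit), which is the step that forces a fixed class to be diagonal.

```python
from fractions import Fraction as F

def conj_upper(s, y, t, N):
    """Conjugate the 2x2 matrix N by M = [[s, y], [0, t]], returning M N M^{-1}."""
    (a, b), (c, d) = N
    # First form the product M N.
    p, q = s * a + y * c, s * b + y * d
    r, u = t * c, t * d
    # Then multiply by M^{-1} = [[1/s, -y/(s t)], [0, 1/t]].
    return [[p / s, -p * y / (s * t) + q / t],
            [r / s, -r * y / (s * t) + u / t]]

# Hypothetical sample data: a trace-zero matrix with lower-left entry c = 4.
s, y, t = F(3), F(5), F(7)
a, b, c = F(2), F(11), F(4)
out = conj_upper(s, y, t, [[a, b], [c, -a]])
```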


Flat representations. This is a more technical situation that must be
used in the case of an elliptic curve with good supersingular reduction.
Let $L_p = H^1_f(G_p, \mathrm{Ad}^0\rho)$ be those cohomology classes in $H^1(G_p, \mathrm{Ad}^0\rho)$
representing extensions $0 \to M \to E \to M \to 0$ in the category of $R[G_p]$-
modules attached to finite flat group schemes over $\mathbf{Z}_p$. We also assume
that $R = \mathcal{O}/\lambda^n$, where $\mathcal{O}$ is the ring of integers in a finite extension of $\mathbf{Q}_p$
and $\lambda$ generates the maximal ideal. The theory of Fontaine-Laffaille implies


that
\[ \frac{\#L_p}{\#H^0(G_p, \mathrm{Ad}^0\rho)} \]

§8. PROOF OF THEOREM 2


We first address a technical point. Let $\Sigma$ be a finite set of primes and
let $\mathbf{Q}_\Sigma$ be the maximal extension of $\mathbf{Q}$ unramified at the primes not in
$\Sigma$. Let $X$ be a module for $G_\Sigma = \mathrm{Gal}(\mathbf{Q}_\Sigma/\mathbf{Q})$. Then $X$ is also a module
for $G_{\mathbf{Q}}$ that is unramified outside $\Sigma$. Some papers, for example [Wi],
consider $H^1(G_\Sigma, X)$, while others, for example [DDT], consider the classes
of $H^1(G_{\mathbf{Q}}, X)$ unramified outside $\Sigma$. Fortunately, the two groups are
isomorphic. In the following, we will find it more convenient to work with
$H^1(G_\Sigma, X)$.


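Proposition 6. $H^1(G_\Sigma, X) \simeq \mathrm{Ker}\Bigl(H^1(G_{\mathbf{Q}}, X) \to \prod_{\ell \notin \Sigma} H^1(I_\ell, X)\Bigr)$.

Proof. The following diagram commutes (the top row is inflation-restriction).

\[
\begin{array}{ccccccc}
0 & \to & H^1(G_\Sigma, X) & \to & H^1(G_{\mathbf{Q}}, X) & \to & H^1(\mathrm{Gal}(\bar{\mathbf{Q}}/\mathbf{Q}_\Sigma), X) \\
 & & & & \downarrow & & \| \\
 & & & & \prod_{\ell \notin \Sigma} \mathrm{Hom}(I_\ell, X) & \xleftarrow{\ \phi\ } & \mathrm{Hom}(\mathrm{Gal}(\bar{\mathbf{Q}}/\mathbf{Q}_\Sigma), X).
\end{array}
\]

The map $\phi$ is injective since a homomorphism that is 0 on $I_\ell$ for all $\ell \notin \Sigma$
must vanish on the smallest normal subgroup generated by all such $I_\ell$,
which is $\mathrm{Gal}(\bar{\mathbf{Q}}/\mathbf{Q}_\Sigma)$. The result follows easily. □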


Proposition 7. If $X$ is finite then $H^1(G_\Sigma, X)$ is finite.


Proof. Choose an open normal subgroup $H$ of $G_\Sigma$ such that $H$ acts trivially
on $X$. Let $K$ be the fixed field of $H$. The group $H^1(H, X) = \mathrm{Hom}(H, X)$ is
finite since it classifies Galois extensions of $K$, unramified outside $\Sigma$, with
Galois group isomorphic to a subgroup of $X$, and there are only finitely
many such extensions by a theorem of Hermite-Minkowski. Since $G_\Sigma/H$ is
finite, the group $H^1(G_\Sigma/H, X)$ is finite by its definition. The result now
follows from the inflation-restriction sequence. □


Corollary. $H^1_{\mathcal{L}}(\mathbf{Q}, X)$ is finite.
Proof. The group is isomorphic to a subgroup of $H^1(G_\Sigma, X)$. □


Let $X$ be a finite module for $G_{\mathbf{Q}}$. Fix a set $\Sigma$ containing $\infty$, all the prime
divisors of $\#X$, and all primes such that $I_\ell$ does not act trivially on $X$.
There exists an open subgroup that acts trivially on $X$. This subgroup
corresponds to some finite extension $K/\mathbf{Q}$, and the inertia group of any
prime not ramifying in $K$ acts trivially on $X$. Therefore we can take $\Sigma$ to
be finite. Let $\Sigma_f$ be the set of finite primes in $\Sigma$. For an integer $r = 0, 1, 2$,
let

\[ \alpha_r : H^r(G_\Sigma, X) \to \hat{H}^r(G_{\mathbf{R}}, X) \times \prod_{\ell \in \Sigma_f} H^r(G_\ell, X) \]

be induced by the restriction maps, where $\hat{H}^r(G_{\mathbf{R}}, X)$ is the modified
Tate cohomology group (when $r > 0$, let $\hat{H}^r = H^r$). By Theorem 1,
$\hat{H}^r(G_{\mathbf{R}}, X) \times \prod H^r(G_\ell, X)$ is the dual of $\hat{H}^{2-r}(G_{\mathbf{R}}, X^*) \times \prod H^{2-r}(G_\ell, X^*)$,
so we may dualize the map

\[ H^{2-r}(G_\Sigma, X^*) \to \hat{H}^{2-r}(G_{\mathbf{R}}, X^*) \times \prod_{\ell \in \Sigma_f} H^{2-r}(G_\ell, X^*) \]

to obtain

\[ \beta_r : \hat{H}^r(G_{\mathbf{R}}, X) \times \prod_{\ell \in \Sigma_f} H^r(G_\ell, X) \to H^{2-r}(G_\Sigma, X^*)^\vee, \]

where $A^\vee = \mathrm{Hom}(A, \mathbf{Q}/\mathbf{Z})$ is the dual of an abelian group $A$. Let
$\mathrm{Ker}^r(G_\Sigma, X) = \mathrm{Ker}\,\alpha_r$.
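Proposition 8. There is a non-degenerate canonical pairing
\[ \mathrm{Ker}^2(G_\Sigma, X) \times \mathrm{Ker}^1(G_\Sigma, X^*) \to \mathbf{Q}/\mathbf{Z}. \]
Proof. The pairing can be defined as follows. Let $f \in \mathrm{Ker}^2$ and $g \in \mathrm{Ker}^1$.
For $\ell \in \Sigma$, we can write $\mathrm{res}_\ell f = \delta\phi_\ell$ and $\mathrm{res}_\ell g = \delta\psi_\ell$, where $\phi_\ell : G_\ell \to X$,
$\psi_\ell \in X^*$, and $\delta$ is the coboundary map of the appropriate dimension. It
can be shown that the cup product $f \cup g = 0 \in H^3(G_\Sigma, \mathbf{Q}_\Sigma^\times)$, so $f \cup g = \delta h$
for an appropriate $h$. Then

\[ (f \cup \psi_\ell) - h = (\phi_\ell \cup g) - h + \delta(\phi_\ell \cup \psi_\ell), \]

hence $(f \cup \psi_\ell) - h$ and $(\phi_\ell \cup g) - h$ represent the same class

\[ z_\ell \in H^2(G_\ell, \bar{\mathbf{Q}}_\ell^\times) \simeq \mathbf{Q}/\mathbf{Z}, \]

and $z_\ell$ is independent of the choices involved. Define

\[ \langle f, g \rangle = \sum_{\ell \in \Sigma} z_\ell \in \mathbf{Q}/\mathbf{Z}. \]

The proof of the non-degeneracy is much more difficult. See [Mi]. □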


Proposition 9. $\alpha_0$ is injective, $\beta_2$ is surjective, and for $r = 0, 1, 2$, we
have $\mathrm{Im}\,\alpha_r = \mathrm{Ker}\,\beta_r$.


Proof. For a proof, see [Mi].


This can all be summarized in the following.


Proposition 10 (Poitou-Tate). The following nine-term sequence is exact:

\[
\begin{aligned}
0 \to H^0(G_\Sigma, X) &\xrightarrow{\ \alpha_0\ } \hat{H}^0(G_{\mathbf{R}}, X) \times \prod_{\ell \in \Sigma_f} H^0(G_\ell, X) \xrightarrow{\ \beta_0\ } H^2(G_\Sigma, X^*)^\vee \\
\to H^1(G_\Sigma, X) &\xrightarrow{\ \alpha_1\ } \prod_{\ell \in \Sigma} H^1(G_\ell, X) \xrightarrow{\ \beta_1\ } H^1(G_\Sigma, X^*)^\vee \\
\to H^2(G_\Sigma, X) &\xrightarrow{\ \alpha_2\ } \prod_{\ell \in \Sigma} H^2(G_\ell, X) \xrightarrow{\ \beta_2\ } H^0(G_\Sigma, X^*)^\vee \to 0,
\end{aligned}
\]

where the unlabeled arrows are maps defined by the non-degeneracy of the
pairing in Proposition 8.


It is also possible to work with infinite sets ©, but then some restrictions 
need to be made on the direct products involved. 

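We can now prove Theorem 2. The definition of the Selmer group yields
the exact sequence

\[ 0 \to H^1_{\mathcal{L}^*}(\mathbf{Q}, X^*) \to H^1(G_\Sigma, X^*) \to \prod_{\ell \in \Sigma} H^1(G_\ell, X^*)/L_\ell^\perp. \]

Dualizing (i.e., $\mathrm{Hom}(-, \mathbf{Q}/\mathbf{Z})$) and using the pairing of Theorem 1 yields
\[ 0 \leftarrow H^1_{\mathcal{L}^*}(\mathbf{Q}, X^*)^\vee \leftarrow H^1(G_\Sigma, X^*)^\vee \leftarrow \prod_{\ell \in \Sigma} L_\ell. \]
Splicing this into the nine-term sequence yields

\[
\begin{aligned}
0 \to H^0(G_\Sigma, X) &\xrightarrow{\ \alpha_0\ } \hat{H}^0(G_{\mathbf{R}}, X) \times \prod_{\ell \in \Sigma_f} H^0(G_\ell, X) \xrightarrow{\ \beta_0\ } H^2(G_\Sigma, X^*)^\vee \\
\to H^1_{\mathcal{L}}(\mathbf{Q}, X) &\to \prod_{\ell \in \Sigma} L_\ell \to H^1(G_\Sigma, X^*)^\vee \to H^1_{\mathcal{L}^*}(\mathbf{Q}, X^*)^\vee \to 0.
\end{aligned}
\]

Therefore

\[
\frac{\#H^1_{\mathcal{L}}(\mathbf{Q}, X)}{\#H^1_{\mathcal{L}^*}(\mathbf{Q}, X^*)}
= \frac{\#H^0(G_\Sigma, X)\,\#H^2(G_\Sigma, X^*)\,\#(1+c)X}{\#H^1(G_\Sigma, X^*)} \prod_{\ell \in \Sigma} \frac{\#L_\ell}{\#H^0(G_\ell, X)},
\]

where we have used the fact for $\ell = \infty$ that
\[ \hat{H}^0(G_{\mathbf{R}}, X) = H^0(G_{\mathbf{R}}, X)/(1 + c)X. \]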
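The step from the spliced sequence to the displayed quotient uses only the fact that, in a finite exact sequence of finite groups, the alternating product of the orders is 1. As a sketch, with the abbreviation $P$ introduced here for convenience (it is not notation from the text):

```latex
\[
P := \hat{H}^0(G_{\mathbf{R}}, X) \times \prod_{\ell \in \Sigma_f} H^0(G_\ell, X),
\qquad
\#P = \frac{\#H^0(G_{\mathbf{R}}, X)}{\#(1+c)X} \prod_{\ell \in \Sigma_f} \#H^0(G_\ell, X),
\]
\[
\frac{\#H^0(G_\Sigma, X)\cdot \#H^2(G_\Sigma, X^*)\cdot \prod_{\ell \in \Sigma} \#L_\ell \cdot \#H^1_{\mathcal{L}^*}(\mathbf{Q}, X^*)}
     {\#P \cdot \#H^1_{\mathcal{L}}(\mathbf{Q}, X)\cdot \#H^1(G_\Sigma, X^*)} = 1 .
\]
```

Solving the second identity for $\#H^1_{\mathcal{L}}(\mathbf{Q}, X)/\#H^1_{\mathcal{L}^*}(\mathbf{Q}, X^*)$ and substituting the expression for $\#P$ gives the stated formula, with the factor for $\ell = \infty$ supplied by $H^0(G_{\mathbf{R}}, X)$.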


We now need the following formula for what may be regarded as a global
Euler characteristic.

Proposition 11. Let $X$ be finite. The groups $H^r(G_\Sigma, X)$, $r = 0, 1, 2$, are
finite, and

\[ \frac{\#H^0(G_\Sigma, X)\,\#H^2(G_\Sigma, X)}{\#H^1(G_\Sigma, X)} = \frac{\#H^0(G_{\mathbf{R}}, X)}{\#X}. \]

Proof. For a proof, see [Mi, p. 82].


Since $H^2(G_\Sigma, X^*)$ is finite, it has the same order as its dual. Also,
$H^0(G_\Sigma, X) = X^{G_\Sigma} = X^{G_{\mathbf{Q}}} = H^0(G_{\mathbf{Q}}, X)$. Therefore the proposition,
applied to $X^*$, reduces the proof to the following.


Lemma 4. $\#(1+c)X \cdot \#H^0(G_{\mathbf{R}}, X^*) = \#X^*$.


Proof. The (non-degenerate) pairing $X \times X^* \to \mu_n$ satisfies $(cx, cx^*) =
c(x, x^*) = (x, x^*)^{-1}$, from which it follows that $((1+c)x, x^*) = (x, (1-c)x^*)$.
Therefore $x^*$ is fixed by $c$ $\iff$ $(1-c)x^* = 0$ $\iff$ $(x, (1-c)x^*) = 0$
for all $x$ $\iff$ $((1+c)x, x^*) = 0$ for all $x$. Therefore $H^0(G_{\mathbf{R}}, X^*)$ is the
exact annihilator of $(1+c)X$, hence is dual to $X/(1+c)X$. The result
follows easily. □

FINITE FLAT GROUP SCHEMES


JOHN TATE


INTRODUCTION


The kernel of an isogeny of degree n of abelian varieties of dimension g 
is, at a place of good reduction, a finite flat group scheme of order $n^{2g}$ over
the local ring of the place. That is perhaps the main reason for studying 
finite flat group schemes, although they are interesting enough in their own 
right, and it is in any case the reason a discussion of them appears in this 
volume. For that reason also, the commutative case is the most important 
for us, and it is in that case that the theory is most interesting and highly 
developed by far. Nevertheless we do not assume commutativity at the 
beginning and develop the basics of the theory without that assumption. 

We use the language of schemes, but without much loss of generality we 
can, and mostly do, restrict to the affine case, because a finite morphism of 
schemes is affine. Thus only very elementary scheme theory is needed — 
not much more than the equivalence between the category of affine schemes 
and the category of rings with arrows reversed. By ring or algebra in this 
paper we mean one which is commutative with unity, unless mention is
made to the contrary. If R is a noetherian ring, a finite flat group scheme 
G over R (that is, over Spec(R)) is of the form G = Spec(A), where A 
is a commutative Hopf algebra over R which is locally free of finite rank 
as R-module. In essence, our topic is the theory of such Hopf algebras. 
Although we treat the case of a general noetherian base ring as far as 
possible, the reader will not lose much by restricting to the case in which R 
is a discrete valuation ring or a field, in which case even the commutative 
algebra involved is quite elementary. 

Beyond the very general properties of group schemes, the only more 
special results we treat (in §4) are some of Raynaud’s, over valuation rings 
of mixed characteristic. For the more refined theory in characteristic p, we 
refer the reader to [deJ].

In dealing with group schemes it is extremely convenient to use some
basic categorical concepts, in particular, the fact that attaching to an object
G in a category C the contravariant set functor represented by G embeds C
as a full subcategory of the category $\hat{C}$ of all such functors. It is often easier
to describe the functor represented by a group scheme than to describe the
group scheme or Hopf algebra itself.

§1. GROUP OBJECTS IN A CATEGORY


The subject of this section is very clearly explained, with a few more
details in [B-L-R, §4.1]. Other sources are, among many, [SGA3, Exp. I]
and [SS]. Let C be a category with finite products and in particular a final
object, the empty product, which we will denote by S in anticipation of the
case in which C = (Sch/S) is the category of schemes over a base scheme
S. Let G be an object of C and $m : G \times G \to G$ a “law of composition” on
G. This m induces, for every T in C, a law of composition on the set


\[ G(T) := \mathrm{Hom}_C(T, G) \]


in an obvious way, because by definition of the product $G \times G$ we have
$(G \times G)(T) = G(T) \times G(T)$. Explicitly, writing the induced law on G(T)
multiplicatively, we have $g_1 g_2 = m \circ (g_1, g_2)$, where $(g_1, g_2) : T \to G \times G$
is the unique arrow such that $\mathrm{pr}_i \circ (g_1, g_2) = g_i$ for $i = 1, 2$. (Here $\mathrm{pr}_1$
and $\mathrm{pr}_2$ are the two projections $G \times G \to G$.) A morphism $f : T' \to T$
induces a map $f^* : G(T) \to G(T')$ by $f^*(g) := g \circ f$, and this map preserves
the law of composition in the sense that $f^*(g_1 g_2) = f^*(g_1) f^*(g_2)$, because
$(g_1, g_2) \circ f = (g_1 \circ f, g_2 \circ f)$. In other words the association $T \mapsto G(T)$ is a
contravariant functor from C to the category of magmas (a magma is a set
with a law of composition).
The following four facts are easily checked and are left to the reader. 


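(1.1) Associativity. The magma G(T) is associative for every T if and
only if the equality $(\mathrm{pr}_1 \mathrm{pr}_2)\,\mathrm{pr}_3 = \mathrm{pr}_1 (\mathrm{pr}_2 \mathrm{pr}_3)$ holds in $G(G \times G \times G)$, i.e.,
if and only if the following diagram is commutative

\[
\text{a)} \qquad
\begin{array}{ccc}
G \times G \times G = G \times (G \times G) & \xrightarrow{\ \mathrm{id} \times m\ } & G \times G \\
\| & & \downarrow{\scriptstyle m} \\
(G \times G) \times G \xrightarrow{\ m \times \mathrm{id}\ } G \times G & \xrightarrow{\ \ m\ \ } & G
\end{array}
\]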


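(1.2) Unit elements. The magmas G(T) have two-sided unit elements
$\varepsilon_T$ (necessarily unique), and these units are preserved by the morphisms
$f^* : G(T) \to G(T')$, if and only if there is a point $\varepsilon \in G(S)$ (recall that S
is the final object in C) such that the equality $\pi^*(\varepsilon) \cdot \mathrm{id} = \mathrm{id} = \mathrm{id} \cdot \pi^*(\varepsilon)$
holds in G(G), where $\pi = \pi_G$ is the unique arrow $G \to S$, that is, if and
only if each triangle in the following diagram commutes

\[
\text{b)} \qquad
\begin{array}{ccccc}
G \times S & \xrightarrow{\ \mathrm{id} \times \varepsilon\ } & G \times G & \xleftarrow{\ \varepsilon \times \mathrm{id}\ } & S \times G \\
 & {\scriptstyle \simeq}\searrow & \downarrow{\scriptstyle m} & \swarrow{\scriptstyle \simeq} & \\
 & & G & &
\end{array}
\]

When that is the case, $\varepsilon_T := \pi_T^*(\varepsilon)$ is the unit in G(T) for each T.

(1.3) Inverses. Suppose the magmas G(T) have two-sided units $\varepsilon_T$
preserved by the $f^*$'s. Then the necessary and sufficient condition that
every element $g \in G(T)$ have a left inverse for every T is that the element
$\mathrm{id} = \mathrm{id}_G \in G(G)$ have a left inverse in G(G), i.e., that there exist an
element $\mathrm{inv} \in G(G)$ such that $\mathrm{inv} \cdot \mathrm{id}_G = \varepsilon_G$, or in other words such that the
diagram

\[
\text{c)} \qquad
\begin{array}{ccccc}
G & \xrightarrow{\ \Delta\ } & G \times G & \xrightarrow{\ \mathrm{inv} \times \mathrm{id}\ } & G \times G \\
{\scriptstyle \pi}\downarrow & & & & \downarrow{\scriptstyle m} \\
S & & \xrightarrow{\qquad \varepsilon \qquad} & & G
\end{array}
\]

is commutative. Then $(\mathrm{inv} \circ g) \cdot g = \varepsilon_T$ for every $g \in G(T)$, any T.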


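(1.4) Commutativity. The magmas G(T) are commutative if and only
if the equality $\mathrm{pr}_1 \mathrm{pr}_2 = \mathrm{pr}_2 \mathrm{pr}_1$ holds in $G(G \times G)$, i.e., if and only if the
diagram

\[
\text{d)} \qquad
\begin{array}{ccc}
G \times G & \xrightarrow{\ \ \tau\ \ } & G \times G \\
 & {\scriptstyle m}\searrow \quad \swarrow{\scriptstyle m} & \\
 & G &
\end{array}
\]

commutes, where $\tau$ is the automorphism interchanging the factors on the
product.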


(1.5) Definition. A group object in C, or a C-group, is an object G in C
together with a morphism $m : G \times G \to G$ such that the induced law of
composition $G(T) \times G(T) \to G(T)$ makes G(T) a group for every T in C. A
C-group G is commutative if the group G(T) is commutative for every T. A
homomorphism of C-groups $G \to G'$ is a morphism $\varphi : G \to G'$ in the category
C such that, for every object T in C, the induced map $G(T) \to G'(T)$ given
by $g \mapsto \varphi \circ g$ is a homomorphism of groups.

From the above discussion it is clear that a pair (G, m) is a group object
if and only if the diagram a) is commutative and there exist morphisms
$\varepsilon : S \to G$ and $\mathrm{inv} : G \to G$ such that diagrams b) and c) commute. Of
course $\varepsilon$ and inv are unique if they exist. And (G, m) is a commutative
group object if and only if in addition diagram d) commutes.
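In the category of finite sets, where the final object S is a one-point set, the diagram conditions a), b), c), d) reduce to the usual group axioms and can be checked exhaustively. The following is an illustrative sketch only; the function names are hypothetical and a map $S \to G$ is modelled simply as an element of G.

```python
from itertools import product

def is_group_object(G, m, e, inv):
    """Check diagrams a), b), c) for a finite set G with maps m, e, inv.

    In the category of sets the final object S is a one-point set, so the
    point e : S -> G is just an element of G.
    """
    assoc = all(m(m(x, y), z) == m(x, m(y, z)) for x, y, z in product(G, repeat=3))
    unit = all(m(e, x) == x == m(x, e) for x in G)
    left_inv = all(m(inv(x), x) == e for x in G)
    return assoc and unit and left_inv

def is_commutative(G, m):
    """Check diagram d): m composed with the swap of factors equals m."""
    return all(m(x, y) == m(y, x) for x, y in product(G, repeat=2))

# Example: the integers mod 4 under addition.
Z4 = range(4)
add4 = lambda x, y: (x + y) % 4
neg4 = lambda x: (-x) % 4
```

Here `is_group_object(Z4, add4, 0, neg4)` holds, while replacing addition by subtraction breaks diagram a).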

Suppose (G, m) and (G', m') are two group objects in C. In order that
a morphism $\varphi : G \to G'$ be a C-group homomorphism it is necessary and
sufficient that the equality $\varphi_*(\mathrm{pr}_1 \mathrm{pr}_2) = \varphi_*(\mathrm{pr}_1)\,\varphi_*(\mathrm{pr}_2)$ hold in $G'(G \times G)$,
i.e., that the diagram

\[
\begin{array}{ccc}
G \times G & \xrightarrow{\ \varphi \times \varphi\ } & G' \times G' \\
{\scriptstyle m}\downarrow & & \downarrow{\scriptstyle m'} \\
G & \xrightarrow{\ \ \varphi\ \ } & G'
\end{array}
\]

be commutative.

(1.6) Group object = Group functor. Suppose we are given an object
G in C, and, instead of a “morphic law of combination” $m : G \times G \to G$,
we are given for each T in C a group structure on G(T) such that for each
$f : T' \to T$ the induced map $f^* : G(T) \to G(T')$ is a group homomorphism.
Then there is a unique $m : G \times G \to G$ which induces the given group
structure on G(T) for each T. The unicity of m follows from the fact that
an $m : G \times G \to G$ can be recovered from the law of composition it induces
on the set $G(G \times G)$, as the product for that law of the two projections;
$m = \mathrm{pr}_1 \mathrm{pr}_2$. On the other hand, it is easy to check that that choice of
m does induce the given law of combination in G(T) for each T. The
point of this paragraph is that a group object in C is the same thing as a
contravariant functor from C to the category (Gr) of groups such that the
underlying functor from C to (Sets) is representable, i.e., isomorphic to a
functor of the form $T \mapsto G(T)$ for some object G of C.

Similarly, if G and G' are C-groups, then to give a homomorphism of
C-groups $\varphi : G \to G'$ is the “same” as to give a homomorphism of the functors
they represent, that is, to give for each T in C a group homomorphism


\[ \varphi_T : G(T) \to G'(T) \]


such that $f^* \circ \varphi_T = \varphi_{T'} \circ f^*$ for every morphism $f : T' \to T$ of objects
in C. One recovers $\varphi \in \mathrm{Hom}_C(G, G') = G'(G)$ as the image of the identity
in $\mathrm{Hom}_C(G, G) = G(G)$ under the map $\varphi_G : G(G) \to G'(G)$.


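(1.7) Kernels. A simple example of the use of (1.6) is the construction
of kernels. Let $\varphi : G \to G'$ be a homomorphism of group objects in C. Let
us define a kernel of $\varphi$ to be a homomorphism of group objects $\alpha : H \to G$
such that, for every T in C, the sequence


\[ 0 \to H(T) \xrightarrow{\ \alpha_T\ } G(T) \xrightarrow{\ \varphi_T\ } G'(T) \]


is exact. Such an H exists if the fiber product indicated by the following
diagram exists in C:

\[
\begin{array}{ccc}
H = G \times_{G'} S & \xrightarrow{\ \mathrm{pr}_2\ } & S \\
{\scriptstyle \mathrm{pr}_1}\downarrow & & \downarrow{\scriptstyle \varepsilon} \\
G & \xrightarrow{\ \ \varphi\ \ } & G'
\end{array}
\]

Then the lefthand vertical arrow $\alpha = \mathrm{pr}_1$ identifies the set H(T) with
$\mathrm{Ker}(G(T) \to G'(T))$, because $S(T) = \{\pi_T\}$ is a singleton for each T, and
$\varepsilon \circ \pi_T = \varepsilon_T$ is the unit in $G'(T)$. This identification makes H(T) a group
in a functorial way so that $\mathrm{pr}_1 : H \to G$ is a kernel for $\varphi$.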
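For finite sets, the fiber product $G \times_{G'} S$ can be computed directly, which makes the identification with the kernel concrete. This is a small sketch with hypothetical names; S is modelled as the one-point set `["*"]` and the unit section sends its point to the unit of $G'$.

```python
def fiber_product_kernel(G, phi, unit_prime):
    """Return H = G x_{G'} S as a list of pairs (g, s) with phi(g) = e(s).

    S is modelled as the one-point set ["*"], and e sends "*" to unit_prime,
    the unit element of G'.
    """
    S = ["*"]
    return [(g, s) for g in G for s in S if phi(g) == unit_prime]

# Example: reduction mod 3 from Z/6 to Z/3, both written additively.
H = fiber_product_kernel(range(6), lambda g: g % 3, 0)
kernel = [g for (g, _) in H]  # the image of H under pr_1
```

Projecting to the first factor recovers the elements 0 and 3, which form the kernel of reduction mod 3 on $\mathbf{Z}/6$.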

Thus if the category C has fiber products, then $\mathrm{Ker}\,\varphi$ exists for every $\varphi$.
We leave to the reader to check that it is unique up to a unique isomorphism
and that, in the notation above, if H' is any group object in C, then to
give a homomorphism $H' \to H$ is the “same” as to give a homomorphism
$H' \to G$ whose composition with $\varphi$ is the trivial homomorphism

\[ H' \xrightarrow{\ \pi\ } S \xrightarrow{\ \varepsilon\ } G', \]

i.e., the sequence


\[ 0 \to \mathrm{Hom}(H', H) \to \mathrm{Hom}(H', G) \to \mathrm{Hom}(H', G') \]


is exact.


(1.8) Cokernels. The question of coset spaces and cokernels cannot be
treated in the same simple-minded way. Even if we assume that $\varphi : G \to G'$
is an injective homomorphism of commutative group objects, the functor
$T \mapsto \mathrm{Coker}(\varphi_T) = G'(T)/\varphi G(T)$ is rarely representable. The situation
is analogous to the case of sheaves of abelian groups, in which the naive
cokernel is only a presheaf in general, not a sheaf. In the commutative
case, one can characterize the desired cokernel as a C-group H with a
homomorphism $G' \to H$ such that, for every C-group H', the sequence


\[ 0 \to \mathrm{Hom}(H, H') \to \mathrm{Hom}(G', H') \to \mathrm{Hom}(G, H') \]


is exact. But to show the existence of such an H and to prove it has other 
desirable properties is often a serious problem. In case C = (Sch/S), the 
category of schemes over a base scheme S, and G is a finite flat closed 
subgroup scheme of G’, then the problem was solved by Grothendieck; we 
discuss the matter in §3. 


§2. GROUP SCHEMES. EXAMPLES 


We now specialize to the case of the category (Sch/S) of schemes over 
a base scheme S. 


(2.1) Definition. An S-group scheme, or simply S-group, is a group ob- 
ject in (Sch/S). 


We will denote the category of S-group schemes by (Gr/S). 


(2.2) Hopf Algebras. For us, S will usually be affine, say S = Spec(R),
and we will often replace S by R in the notation and terminology, writing
(Sch/R) and R-group scheme, etc. Let G = Spec A be an affine R-scheme.
In view of the arrow-reversing equivalence between the category of
commutative R-algebras and the category of affine R-schemes, to make G into
an R-group scheme is to give R-algebra homomorphisms


\[ \tilde{m} : A \to A \otimes_R A, \qquad \tilde{\varepsilon} : A \to R, \qquad \widetilde{\mathrm{inv}} : A \to A, \]


corresponding to the morphisms m, $\varepsilon$, inv discussed in §1, which make
commutative the diagrams, let’s call them ã), b̃), c̃), obtained from diagrams
a), b), c) by reversing arrows, replacing S by R, G by A, $\times$ by $\otimes_R$, and
putting ~ on the labels of the arrows, with $\tilde{\Delta} : A \otimes_R A \to A$ induced by
the multiplication in the ring A. One calls $\tilde{m}$ the comultiplication, $\tilde{\varepsilon}$ the
augmentation, or counit, and $\widetilde{\mathrm{inv}}$ the antipode. A commutative R-algebra
A with unit which is furnished with homomorphisms $\tilde{m}$, $\tilde{\varepsilon}$, $\widetilde{\mathrm{inv}}$ satisfying
the stated commutative diagram conditions is called a commutative Hopf
algebra. Thus the category of affine R-group schemes is antiequivalent to
the category of commutative Hopf algebras over R, with the obvious
definition of homomorphism of Hopf algebras. Commutative Hopf algebras,
especially over fields, have been extensively studied for a long time in
connection with the theory of affine algebraic groups. Cocommutative Hopf
algebras have been around a long time also; examples are group algebras,
enveloping algebras of Lie algebras, and the one originally studied by Hopf,
the homology of a manifold M with a product operation $M \times M \to M$.
Some general references for these types of Hopf algebras are [A], [C-S],
[M-M], [Sw] and [W]. But it’s only in recent times that important Hopf algebras
which are neither commutative nor cocommutative have been discovered,
usually as deformations of commutative ones, and are being studied
seriously ([Dr], [SS-SS]), under the name “quantum groups.” But in this
paper all the Hopf algebras we encounter will be either commutative or
cocommutative, mostly the former.


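(2.3) The Augmentation Ideal. Let G = Spec(A) be an affine R-group
scheme. The kernel of the augmentation map $\tilde{\varepsilon}$ is an ideal $I = I_G$ in
A called the augmentation ideal. As R-module we have $A = R \cdot 1 \oplus I$,
direct sum, because the canonical map $R \to A$ splits the exact sequence
$0 \to I \to A \to R \to 0$. Thus $A \otimes A = R \oplus (I \otimes 1) \oplus (1 \otimes I) \oplus (I \otimes I)$. An
important fact about the comultiplication is that


\[ \tilde{m}(f) - f \otimes 1 - 1 \otimes f \in I \otimes I, \quad \text{for } f \in I, \]


as one sees by applying the maps $\tilde{\varepsilon} \otimes \mathrm{id}$ and $\mathrm{id} \otimes \tilde{\varepsilon}$ whose kernels $I \otimes A$ and
$A \otimes I$ have intersection $I \otimes I$.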


(2.4) First examples; $\mathbf{G}_a$ and $\mathbf{G}_m$. Let G = Spec(A) be an affine
scheme over a ring R. To give G an R-group structure it suffices, as
explained in §1, to give a group structure on a functor


\[ T \mapsto G(T) = \mathrm{Hom}_{(\mathrm{Sch}/R)}(T, G) \]


from R-schemes T to sets. One does not have to construct m and show the
existence of $\varepsilon$ and inv such that diagrams a), b), c) commute; one recovers
$m : G \times G \to G$ as the composition of the two projections $\mathrm{pr}_1$ and $\mathrm{pr}_2$ in the
group $G(G \times G)$ and similarly $\varepsilon$ and inv. Since G is affine, we can restrict
T to be affine if we wish, and will usually do so. If T = Spec B, then we
write $G(B) := G(T) = \mathrm{Hom}_{R\text{-alg}}(A, B)$. Thus, to make G an R-group is
to make $B \mapsto \mathrm{Hom}_{R\text{-alg}}(A, B)$ a functor from R-algebras to groups. Then
the comultiplication $\tilde{m} : A \to A \otimes_R A$ is obtained as the composition in
the group $\mathrm{Hom}_{R\text{-alg}}(A, A \otimes_R A)$ of the two maps


\[ \widetilde{\mathrm{pr}}_1 : a \mapsto a \otimes 1 \quad \text{and} \quad \widetilde{\mathrm{pr}}_2 : a \mapsto 1 \otimes a. \]
Here are some standard examples.


The additive group $\mathbf{G}_a$. Let $\mathbf{G}_a = \mathrm{Spec}(R[u])$, u an indeterminate. For
each commutative R-algebra B, the map $f \mapsto f(u)$ identifies


\[ \mathrm{Hom}_{R\text{-alg}}(R[u], B) \]


with B itself. The additive group structure on B for varying B makes $\mathbf{G}_a$
an R-group, with comultiplication $\tilde{m}$ determined by $\tilde{m}(u) = u \otimes 1 + 1 \otimes u$.
Not surprisingly, one finds $\tilde{\varepsilon}(u) = 0$, and $\widetilde{\mathrm{inv}}(u) = -u$. More generally,
if M is any R-module, and $A = \mathrm{Sym}_R(M)$ its symmetric algebra over R,
then $\mathrm{Hom}_{R\text{-alg}}(A, B) = \mathrm{Hom}_{R\text{-mod}}(M, B)$ is a commutative group under
addition for each R-algebra B. Thus Spec(A) is a commutative R-group.
Taking for M the free R-module Ru on one generator u, we recover $\mathbf{G}_a$.
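The statement that $u \mapsto u \otimes 1 + 1 \otimes u$ encodes addition can be tested on polynomials with integer coefficients. The sketch below is illustrative and makes its own representation choices: a polynomial is a dictionary from exponents to coefficients, and an element of $R[u] \otimes R[u]$ is a dictionary keyed by pairs of exponents.

```python
from math import comb

def comultiply(f):
    """Comultiplication on R[u] for the additive group: u -> u(x)1 + 1(x)u.

    f is a dict {k: coeff} for sum coeff * u^k; the result is a dict
    {(i, j): coeff} for sum coeff * u^i (x) u^j, via the binomial theorem.
    """
    out = {}
    for k, c in f.items():
        for i in range(k + 1):
            key = (i, k - i)
            out[key] = out.get(key, 0) + c * comb(k, i)
    return out

def evaluate(f, b):
    """Evaluate the polynomial f at the ring element b."""
    return sum(c * b**k for k, c in f.items())

def evaluate_tensor(t, b1, b2):
    """Evaluate an element of R[u] (x) R[u] at the pair of points (b1, b2)."""
    return sum(c * b1**i * b2**j for (i, j), c in t.items())

f = {0: 2, 1: -1, 3: 4}  # the polynomial 2 - u + 4u^3
```

Evaluating `comultiply(f)` at a pair `(b1, b2)` agrees with evaluating `f` at `b1 + b2`, which is the functorial meaning of the comultiplication.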


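The multiplicative group $\mathbf{G}_m$. Let $\mathbf{G}_m = \mathrm{Spec}(R[u, u^{-1}])$. For each
R-algebra B, the map $f \mapsto f(u)$ identifies $\mathrm{Hom}_{R\text{-alg}}(R[u, u^{-1}], B)$ with
the multiplicative group $B^\times$ of invertible elements of B. Thus $\mathbf{G}_m$ is an
R-group, with


\[ \tilde{m}(u) = (u \otimes 1)(1 \otimes u) = u \otimes u, \quad \tilde{\varepsilon}(u) = 1, \quad \text{and} \quad \widetilde{\mathrm{inv}}(u) = u^{-1}. \]


This example has at least two important generalizations which we discuss
in the next paragraphs.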
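The fact from (2.3) that $\tilde{m}(f) - f \otimes 1 - 1 \otimes f$ lies in $I \otimes I$ for $f \in I$ can be checked by hand for $\mathbf{G}_m$. As a worked instance (our choice of element, not one made in the text), take $f = u - 1$, which lies in the augmentation ideal because $\tilde{\varepsilon}(u) = 1$:

```latex
\[
\begin{aligned}
\tilde{m}(u - 1) - (u - 1) \otimes 1 - 1 \otimes (u - 1)
 &= u \otimes u - 1 \otimes 1 - u \otimes 1 + 1 \otimes 1 - 1 \otimes u + 1 \otimes 1 \\
 &= u \otimes u - u \otimes 1 - 1 \otimes u + 1 \otimes 1 \\
 &= (u - 1) \otimes (u - 1) \in I \otimes I .
\end{aligned}
\]
```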


(2.5) The general linear group $\mathbf{GL}_n$. Let n be an integer > 0, and
let $U = (u_{ij})$ and $V = (v_{ij})$ be two $n \times n$ matrices with independent
indeterminate entries. In the polynomial ring of $2n^2$ variables


\[ R[u, v] := R[u_{11}, u_{12}, \ldots, u_{nn}, v_{11}, \ldots, v_{nn}] \]


let J be the ideal generated by the $n^2$ entries of the matrix $UV - I$, and let
$A = R[u, v]/J$. Then $f \mapsto f(U)$ gives a bijection between $\mathrm{Hom}_{R\text{-alg}}(A, B)$
and the group $\mathbf{GL}_n(B) := (M_n(B))^\times$ of invertible $n \times n$ matrices with entries
in B, because a right inverse of a square matrix is unique if it exists, and
is a left inverse as well. Thus Spec(A) is an R-group scheme, denoted by
$\mathbf{GL}_n$. For n = 1, we recover $\mathbf{G}_m = \mathbf{GL}_1$.

A linear representation of degree n of an R-group scheme G is a
homomorphism of R-group schemes $G \to \mathbf{GL}_n$. To give such a homomorphism
is the same as to give an invertible $n \times n$ matrix $(a_{ij})$ of sections of $\mathcal{O}_G$
such that $\tilde{m}(a_{ik}) = \sum_{j=1}^{n} a_{ij} \otimes a_{jk}$.

Exercise. Generalize $\mathbf{GL}_n$ in the following way. Instead of $M_n(R)$, take
D to be any (not necessarily commutative) R-algebra which is free of finite
rank as R-module. Show that there is an affine R-group scheme, call it $D^\times$,
such that $D^\times(B) = (D \otimes_R B)^\times$ for every commutative R-algebra B.


(2.6) Diagonalizable group schemes. If X is an ordinary commutative
group, we denote by $R[X] = \bigoplus_{x \in X} Rx$ the group algebra of X over R. The
association $X \mapsto D(X) := \mathrm{Spec}(R[X])$ is a contravariant functor from
the category (Ab) of abelian groups to the category of R-schemes. In
fact, it is naturally a functor to commutative R-group schemes because the
identifications


\[ (D(X))(B) = \mathrm{Hom}_{R\text{-alg}}(R[X], B) = \mathrm{Hom}_{(\mathrm{Ab})}(X, B^\times) \]


give us a commutative group structure on the functor $B \mapsto D(X)(B)$ for
each X. Hence D(X) is a commutative R-group scheme. On the basis
elements $x \in X$ of R[X] we have


\[ \tilde{m}(x) = x \otimes x, \quad \tilde{\varepsilon}(x) = 1, \quad \text{and} \quad \widetilde{\mathrm{inv}}(x) = x^{-1}, \]


as is easily checked. A special case is $\mathbf{G}_m = \mathrm{Spec}(R[u, u^{-1}]) = D(\mathbf{Z})$,
because $R[u, u^{-1}]$ is the group algebra of the infinite cyclic group generated
by u.

If X is a finite abelian group of order n, then R[X] is a free R-module 
of rank n, and D(X) is finite and flat over R and is therefore an example 
of the type of R-group scheme which is our main concern in this paper. 

Suppose

\[ X \to Y \to Z \to 0 \]

is an exact sequence of abelian groups. Then
\[ 0 \to \mathrm{Hom}(Z, B^\times) \to \mathrm{Hom}(Y, B^\times) \to \mathrm{Hom}(X, B^\times) \]

is exact for every B, and consequently the corresponding sequence of group
schemes


\[ 0 \to D(Z) \to D(Y) \to D(X) \]


is exact (meaning that $D(Z) \to D(Y)$ is a kernel of $D(Y) \to D(X)$ in the
sense of §1).


(2.7) The group schemes $\mu_n$. Let n be an integer $\geq 1$. The R-group
scheme $D(\mathbf{Z}/n\mathbf{Z})$ is denoted by $\mu_n$, and is called the scheme of n-th roots
of unity over R. The dual of the exact sequence


\[ \mathbf{Z} \xrightarrow{\ n\ } \mathbf{Z} \to \mathbf{Z}/n\mathbf{Z} \to 0 \]

is

\[ 0 \to \mu_n \to \mathbf{G}_m \xrightarrow{\ n\ } \mathbf{G}_m. \]

Thus $\mu_n$ is the kernel of raising to the n-th power in $\mathbf{G}_m$. For each R-algebra
B we have $\mu_n(B) = \{b \in B \mid b^n = 1\}$. The arrow $\mu_n \to \mathbf{G}_m$ corresponds
to the algebra map $R[u, u^{-1}] \to R[u, u^{-1}]/(u^n - 1)$ and identifies $\mu_n$ with
a closed finite flat subgroup scheme of $\mathbf{G}_m$, of order n, in the sense of §3.
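The description $\mu_n(B) = \{b \in B \mid b^n = 1\}$ is easy to compute for small rings. A minimal sketch, assuming $B = \mathbf{Z}/m\mathbf{Z}$ (the function name `mu` is ours):

```python
def mu(n, m):
    """Points of mu_n with values in B = Z/mZ: all b with b^n = 1 in B."""
    return [b for b in range(m) if pow(b, n, m) == 1 % m]

def is_closed_under_product(points, m):
    """Check that the listed points are stable under multiplication mod m."""
    return all((a * b) % m in points for a in points for b in points)
```

For instance `mu(3, 7)` gives `[1, 2, 4]`, while `mu(2, 8)` gives `[1, 3, 5, 7]`: over a ring that is not a domain, the group of points of $\mu_n$ can have more than n elements even though the scheme has order n.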

Suppose the abelian group X is finitely generated. Then X is isomorphic
to a finite product of cyclic groups, hence D(X) to a finite product of
copies of $\mathbf{G}_m$ and $\mu_n$'s, for various n, and therefore to a closed subgroup
of a product $\mathbf{G}_m^r$ of copies of $\mathbf{G}_m$. Viewing $\mathbf{G}_m^r$ as the closed subgroup
of $\mathbf{GL}_r$ consisting of diagonal matrices we obtain a faithful linear
representation of D(X) which identifies D(X) with a diagonal closed subgroup
scheme of $\mathbf{GL}_r$. That is the reason the group schemes D(X) are called
“diagonalizable.”


(2.8) Base change. Let U be an S-scheme. If T is an S-scheme we
sometimes write $U_T := U \times_S T$ for the “base change from S to T of U.”
Every T-scheme V is an S-scheme in a natural way, and $U_T(V) = U(V)$.
Thus, if G is an S-group scheme, then the functor $V \mapsto G_T(V)$ is a group
functor on (Sch/T), and hence $G_T$ is a T-group scheme. Every scheme S
is uniquely a (Spec $\mathbf{Z}$)-scheme, and all our examples so far are the canonical
base changes from $\mathbf{Z}$ to R of group schemes over $\mathbf{Z}$. That is why the groups
$\mathbf{G}_a(B)$, $\mathbf{G}_m(B)$, etc., depend only on B as a ring, i.e., as a $\mathbf{Z}$-algebra, and
not on B as an R-algebra. From now on we will let $\mathbf{G}_a$, $\mathbf{G}_m$, $\mathbf{GL}_n$, D(X)
stand for the versions over $\mathbf{Z}$ and will write $(\mathbf{G}_a)_S := \mathbf{G}_a \times S$, etc. for their
base change to a scheme S.

For an S-scheme T, if we denote by $B_T = \Gamma(T, \mathcal{O}_T)$ the ring of sections
of the structure sheaf of T, we have


\[
\begin{aligned}
(\mathbf{G}_a)_S(T) &= \mathbf{G}_a(T) = B_T \quad \text{(additive group)} \\
(\mathbf{G}_m)_S(T) &= \mathbf{G}_m(T) = B_T^\times \quad \text{(multiplicative group)} \\
(\mathbf{GL}_n)_S(T) &= \mathbf{GL}_n(T) = \mathbf{GL}_n(B_T) = M_n(B_T)^\times \\
D(X)_S(T) &= D(X)(T) = \mathrm{Hom}_{(\mathrm{Ab})}(X, B_T^\times) = \mathrm{Hom}_{(\mathrm{Ab})}(X, \mathbf{G}_m(T)).
\end{aligned}
\]


(2.9) Characters and group-like elements. Let G be an S-group
scheme. A character of G is a homomorphism of S-group schemes


\[ \chi : G \to (\mathbf{G}_m)_S \]


or, what is the same, a non-vanishing section of the structure sheaf $\mathcal{O}_G$ of
G for which the equality $m^*\chi = (\mathrm{pr}_1^*\chi)(\mathrm{pr}_2^*\chi)$ holds on $G \times_S G$. These
characters form a subgroup


\[ \mathrm{Hom}_{(\mathrm{Gr}/S)}(G, (\mathbf{G}_m)_S) \quad \text{of} \quad \mathrm{Hom}_{(\mathrm{Sch}/S)}(G, (\mathbf{G}_m)_S) = \mathbf{G}_m(G). \]


If S = Spec(R) and G = Spec(A) are affine, then a character $\chi$ of G is an
invertible element of A such that $\tilde{m}\chi = (\chi \otimes 1)(1 \otimes \chi) = \chi \otimes \chi$. Such an
element of a Hopf algebra A is called group-like. The group-like elements
of A form a subgroup of $A^\times$, the group of characters of Spec(A) defined
over R.
The functor
\[ T \mapsto \mathrm{Hom}_{(\mathrm{Gr}/T)}(G_T, (\mathbf{G}_m)_T) \]

is a contravariant functor from (Sch/S) to (Ab). If it is representable, the
representing commutative S-group scheme is called the character group
scheme of G.

If G' is the character group scheme of G, then for each T in (Sch/S) we
have a pairing


\[ (*) \qquad G'(T) \times G(T) \to \mathbf{G}_m(T) \]
given by the map
\[ G'(T) = \mathrm{Hom}_{(\mathrm{Gr}/T)}(G_T, (\mathbf{G}_m)_T) \to \mathrm{Hom}_{(\mathrm{Gr})}(G(T), \mathbf{G}_m(T)). \]


The pairings (*) are compatible with base change $T' \to T$. Conversely,
given S-group schemes G and G', a collection of pairings (*) compatible
with base change determines a homomorphism of G'(T) into the group
of homomorphisms of the functor $G_T$ into the functor $(\mathbf{G}_m)_T$, hence a
homomorphism of G'(T) into $\mathrm{Hom}_{(\mathrm{Gr}/T)}(G_T, (\mathbf{G}_m)_T)$ for each T. If these
homomorphisms are isomorphisms, then the pairings (*) identify G' with
the character group scheme of G.


(2.10) The duality between $X_S$ and $D(X)_S$. Let X be a set, S a
scheme. The constant S-scheme $X_S$ attached to X is by definition the
disjoint union $X_S = \coprod_{x \in X} S_x$ of copies $S_x$ of S indexed by X. Then for
an S-scheme T, an element $f \in X_S(T)$, that is, a morphism of S-schemes
$f : T \to X_S$, is determined by the collection of subsets $U_x = f^{-1}(S_x)$ of
T. These subsets are open, disjoint, and cover T. The restriction of f to
$U_x$ is the unique morphism $U_x \to S_x = S$. Such a covering determines and
is determined by the locally constant X-valued function $\psi$ on T taking the
value x on $U_x$ for each $x \in X$. In this way, $X_S(T)$ is identified with the set
of locally constant functions $\psi : T \to X$. If T is non-empty and connected,
then $X_S(T) = X$.
Since $X_S = \coprod_{x \in X} S_x$ we have


\[ \Gamma(X_S, \mathcal{O}_{X_S}) = \prod_{x \in X} \Gamma(S_x, \mathcal{O}_{S_x}) \]


and since $S_x = S$ for all x, this is simply the ring of functions on X with
values in $\Gamma(S, \mathcal{O}_S)$. The scheme $X_S$ is affine if and only if S = Spec(R)
is affine and X is finite (or R = (0)), in which case $X_S = \mathrm{Spec}(A)$, where
A = Map(X, R) is the ring of R-valued functions on X.

Suppose now X is a group. Then $X_S(T)$ is a group under value-wise
composition of the locally constant functions $\psi : T \to X$, so $X_S$ is a
group scheme, the constant S-group scheme determined by X. It is easy
to check that a section $\chi$ of $\mathcal{O}_{X_S}$ is a character of $X_S$ if and only if, when
viewed as above as a function on X with values in $\Gamma(S, \mathcal{O}_S)$, $\chi$ is a group
homomorphism $X \to \Gamma(S, \mathcal{O}_S)^\times$.

Suppose X is a commutative group. Then such a homomorphism $\chi$ is
a point of D(X) with values in S, so the group of characters of $X_S$ is
D(X)(S). The same is true after base change $T \to S$. Hence $D(X)_S$ is the
character group scheme of $X_S$ (hence the notation: D(X) = dual of X).

Slightly less tautological is the fact that $X_S$ is the character group
scheme of $D(X)_S$. The pairing

\[
D(X)(T) \times X(T) \to \Gamma(T, \mathcal{O}_T)^* = \mathbf{G}_m(T)
\]

takes $\chi \times \varphi$ into “$\chi \circ \varphi$,” by which we mean the section of $\mathcal{O}_T$ which coincides
with the section $\chi(x)$ on the set $\varphi^{-1}(x) = U_x$ for each $x \in X$. This pairing
gives a homomorphism

\[
X(T) \to \mathrm{Hom}_{(\mathrm{Gr}/T)}(D(X)_T, (\mathbf{G}_m)_T)
\]

for each $T$. To show it is an isomorphism for all $T$ it is enough to show it
is for $T$ affine, because each side, as functor of $T$, is a sheaf in the Zariski
topology. Suppose $T = \mathrm{Spec}(B)$, so $D(X)_T = \mathrm{Spec}(B[X])$. The character
of $D(X)_T$ corresponding to $\varphi \in X(T)$ is the section of $\mathcal{O}_{D(X)_T}$ which is $x$
on the open set $\varphi^{-1}(x)$, that is, is the group-like element $\sum_x e_{\varphi,x}\, x \in B[X]$,
where $e_{\varphi,x}$ is the idempotent in $B$ which is the “characteristic function” of
$U_x = \varphi^{-1}(x)$. On the other hand, it is easy to check that every group-like
element of the Hopf algebra $B[X]$ is of the form $\sum_x e_x x$, where $\{e_x, x \in X\}$
is a family of orthogonal idempotents in $B$, indexed by $X$, whose sum
is 1. For more details, see Grothendieck’s discussion in the first sections of
[SGA3, I].
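
The description of group-like elements can be checked by brute force in a small case. The following sketch is an illustration only and is not part of the original argument; the choices $B = \mathbf{Z}/6\mathbf{Z}$, $X = \mathbf{Z}/2\mathbf{Z}$ and the function names are assumptions made for the example. It enumerates coefficient vectors $(c_x)$ with $c_x c_y = \delta_{xy} c_x$ and $\sum_x c_x = 1$, which is the group-like condition written out in the basis $X$ of $B[X]$.

```python
from itertools import product

def group_like_elements(n, group_order):
    """List group-like elements of B[X] for B = Z/nZ and X = Z/(group_order)Z.

    An element sum_x c_x * x is returned as the tuple (c_0, ..., c_{m-1}).
    It is group-like when the c_x are orthogonal idempotents with sum 1.
    """
    found = []
    for coeffs in product(range(n), repeat=group_order):
        orthogonal = all(
            (coeffs[i] * coeffs[j]) % n == (coeffs[i] if i == j else 0)
            for i in range(group_order)
            for j in range(group_order)
        )
        if orthogonal and sum(coeffs) % n == 1:
            found.append(coeffs)
    return found

def multiply(g, h, n):
    """Multiply two elements of B[X] (X cyclic) by convolving coefficients."""
    m = len(g)
    out = [0] * m
    for i in range(m):
        for j in range(m):
            out[(i + j) % m] = (out[(i + j) % m] + g[i] * h[j]) % n
    return tuple(out)

elements = group_like_elements(6, 2)
print(elements)
```

Under these assumptions the search finds four elements, matching the four locally constant functions from the two-point space $\mathrm{Spec}(\mathbf{Z}/6\mathbf{Z})$ to $X$, and they are closed under multiplication in $B[X]$.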


(2.11) Derivations. Suppose $G = \mathrm{Spec}(A)$ is an affine $R$-group scheme.
Let $I$ be the augmentation ideal and $m : A \to A \otimes A$ the comultiplication,
as usual. Let $\pi : A = R1 \oplus I \to I/I^2$ be the $R$-linear map killing $R1$ and
projecting $I$.

Proposition. Let $M$ be an $A$-module and $\psi : M \otimes A \to M$ the map
giving the action of $A$ on $M$. The map $\lambda \mapsto \psi \circ ((\lambda \circ \pi) \otimes \mathrm{id}) \circ m$ is
an isomorphism from $\mathrm{Hom}_{(R\text{-mod})}(I/I^2, M)$ to $\mathrm{Der}_R(A, M)$, the module of
$R$-linear derivations $A \to M$.

Corollary. The map $(\pi \otimes \mathrm{id}) \circ m : A \to (I/I^2) \otimes_R A = \Omega^1_{A/R}$ is a universal
$R$-linear derivation for $A$.

We sketch a proof. For more details see for example [W], 11.3. The
corollary follows from the proposition because the map $\lambda \mapsto \psi \circ (\lambda \otimes \mathrm{id})$ is
a bijection from $\mathrm{Hom}_{(R\text{-mod})}(I/I^2, M)$ to $\mathrm{Hom}_{(A\text{-mod})}((I/I^2) \otimes_R A, M)$.

Let $B$ be an $R$-algebra and $N$ a $B$-module. Make $B \oplus N$ a $B$-algebra
with $N^2 = (0)$ and let $j : B \oplus N \to B$ be the projection killing $N$. The
induced group homomorphism $j_* : G(B \oplus N) \to G(B)$ is a projection to
the subgroup $G(B)$. Hence $G(B \oplus N) = H \rtimes G(B)$, semidirect product,
where $H = \mathrm{Ker}(j_*)$. For $x \in G(B)$, let $N_x$ denote $N$ viewed as $A$-module
via $x : A \to B$. The coset $Hx$ is the set of all homomorphisms $A \to B \oplus N$
lifting $x : A \to B$. A standard computation shows that these are the maps
of the form $x \oplus \delta$ where $\delta : A \to N_x$ is an $R$-linear derivation.

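Let $\varepsilon_B : A \to R \to B$ be the identity in $G(B)$. For $\delta \in \mathrm{Der}(A, N_{\varepsilon_B})$
and $x \in G(B)$ define $\delta_x$ by $(\varepsilon_B \oplus \delta)x = (x \oplus \delta_x)$. Consideration of the group
$G(B \oplus N)$ shows that the map $\delta \mapsto \delta_x$ is a bijection from $\mathrm{Der}_R(A, N_{\varepsilon_B})$
to $\mathrm{Der}_R(A, N_x)$, and working out the group law explicitly one finds the
formula $\delta_x = \psi \circ (\delta \otimes x) \circ m$, where $\psi : N \otimes B \to N$ is the map giving
the action of $B$ on $N$. (Exercise: Show that the map $\delta \mapsto \varepsilon_B \oplus \delta$ is a
group isomorphism from $\mathrm{Der}_R(A, N_{\varepsilon_B})$ to $H$.) On the other hand, from
the definitions one checks that the map $\lambda \mapsto \lambda \circ \pi$ is a bijection from
$\mathrm{Hom}_{(R\text{-mod})}(I/I^2, N)$ to $\mathrm{Der}_R(A, N_{\varepsilon_B})$. Taking $B = A$, $N = M$, $x = \mathrm{id}$
and putting things together gives the proposition.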


Proposition. Let $D \in \mathrm{Der}_R(A, A)$ be a derivation of the $R$-algebra $A$,
and let $\lambda : I/I^2 \to A$ be the $R$-linear map corresponding to $D$ as in the
proposition just proved. Then $D$ is right invariant if and only if $\lambda(I/I^2) \subset
R1$, in which case $\varepsilon_A \circ D = \lambda \circ \pi$, and $D$ is the unique invariant derivation
of $A$ such that $D(f) \equiv \lambda\pi(f) \pmod{I}$ for all $f \in I$.


Proof. For each point $x \in G(B)$, any $B$, $D(x) := x \circ D$ is in $\mathrm{Der}(A, B_x)$.
We say $D$ is right invariant if $(x \oplus D(x))y = (xy \oplus D(xy))$ for all $x, y \in
G(B)$, any $B$, or, equivalently, if $D(x) = D(\varepsilon_B)_x$ for all $x \in G(B)$ in the
notation $\delta_x$ of the previous paragraph. As usual, this condition will hold for
all $B$, $x$ if it holds for $B = A$, $x = \mathrm{id}$, that is, if $D = \psi \circ ((\varepsilon_A \circ D) \otimes \mathrm{id}) \circ m$.
Hence $D$ is invariant if and only if $\varepsilon_A \circ D = \lambda \circ \pi$. For arbitrary $D$ a
computation shows that $\varepsilon_A \circ D = \varepsilon_A \circ \lambda \circ \pi$, and the proposition follows.
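
As an illustration (an example added here, not taken from the text), consider the multiplicative group $G = (\mathbf{G}_m)_R$, so $A = R[t, t^{-1}]$, $m(t) = t \otimes t$ and $I = (t - 1)$. Since $t = 1 + (t - 1)$, the element $\pi(t)$ is the class $\overline{t - 1}$, which is a basis of $I/I^2$. Let $\lambda : I/I^2 \to A$ be the $R$-linear map with $\lambda(\overline{t - 1}) = 1$, so that $\lambda(I/I^2) \subset R1$. The corresponding derivation satisfies

\[
D(t) = \psi\big( (\lambda\pi \otimes \mathrm{id})(t \otimes t) \big) = \lambda(\overline{t - 1})\, t = t,
\]

so $D = t\, d/dt$, the familiar invariant derivation of the multiplicative group. One checks directly that $D(t - 1) = t \equiv 1 = \lambda\pi(t - 1) \pmod{I}$, in agreement with the proposition.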


§3. FINITE FLAT GROUP SCHEMES; PASSAGE TO QUOTIENT 


Throughout this section, $S$ is a locally noetherian base scheme. An
$S$-scheme $X$ is finite and flat over $S$ if and only if $\mathcal{O}_X$ is locally free of
finite rank as $\mathcal{O}_S$-module, that is, if and only if there is a covering of $S$ by
affine open subsets $U$ such that the morphisms $X|_U \to U$ are of the form
$\mathrm{Spec}(A) \to \mathrm{Spec}(R)$ with $A$ free of finite rank as $R$-module. This rank is a
locally constant function $n$ on $S$ with integer values $\ge 0$ which we call the
order of $X$ over $S$.


Notation. We denote the order of $X$ over $S$ by $[X : S]$, and sometimes
write simply “$[X : S] = n$” to indicate that $X$ is finite and flat over $S$ and
that $n$ is its order.

3.1 Proposition. (i) Suppose $X \to Y \to S$ are morphisms of schemes
and suppose $[X : Y] = m$ and $m$ is a constant $> 0$. Then $X$ is finite flat
over $S$ if and only if $Y$ is, in which case $[X : S] = [X : Y][Y : S]$, as
functions on $Y$.

(ii) If $[X_i : S] = n_i$, $i = 1, 2$, then $[X_1 \times_S X_2 : S] = n_1 n_2$.

(iii) If $[X : S] = n$, then $[X \times_S T : T] = n$ for every $S$-scheme $T$.


Proof. (i) Since $m > 0$, $X \to Y$ is faithfully flat, hence $X \to S$ flat implies
$Y \to S$ flat. Since $S$ is noetherian, $X \to S$ finite implies $Y \to S$ finite.
The rest of (i), and (ii) and (iii), are left to the reader.


Finite flat $S$-group schemes are our main concern. So far our only
examples are the constant group schemes $X_S$ attached to a finite group $X$
and their duals $D(X)_S$ for $X$ abelian. Both $X_S$ and $D(X)_S$ have the same
(constant) order as the group $X$. In particular, for each integer $n \ge 1$,
both $(\mathbf{Z}/n\mathbf{Z})_S$ and $(\mu_n)_S$ have order $n$.

If $S = \mathrm{Spec}(R)$ is affine then finite flat $S$-schemes are affine. We are
ultimately interested in the case $R$ is the ring of integers in a local field, in
particular the case $R = \mathbf{Z}_p$. Therefore we will limit the discussion to the
affine case and will often assume $G = \mathrm{Spec}(A)$ with $A$ free over $R$, not only
locally free, which is automatic if $R$ is a local ring.

Note that if $[G : S] = [A : R] = n$, then the augmentation ideal $I$
(cf. 2.3) is locally free of rank $n - 1$ as $R$-module. This makes the case
$n = 2$ very easy to analyze.


(3.2) Example-exercise; $G$ of order 2. Suppose $R$ is a ring and $G =
\mathrm{Spec}(A)$ is an affine $R$-scheme such that $[G : R] = 2$, with an associative
law of combination

\[
m : G \times_R G \to G
\]

for which there is a 2-sided unit $\varepsilon : S = \mathrm{Spec}(R) \to G$, but not necessarily
an inverse. Let

\[
I = \mathrm{Ker}(\varepsilon : A \to R).
\]

Then $I$ is an invertible (= locally free of rank 1) $R$-module. Assume $I$ is
free with basis element $x$, so $A = R + Rx$ is a free $R$-module of rank 2 with
basis $\{1, x\}$. The ring structure of $A$ is determined by the element $a \in R$
such that $x^2 = ax$. As discussed in (2.3), the comultiplication $m$ must be
of the form

\[
m(1) = 1 \otimes 1 = 1 \quad\text{and}\quad m(x) = x \otimes 1 + 1 \otimes x + b(x \otimes x)
\]

for some $b \in R$. Check that for $m : A \to A \otimes_R A$ to be a homomorphism
of $R$-algebras, it is necessary and sufficient that $(ab + 1)(ab + 2) = 0$ in $R$.
Assuming that is the case, $G$ is a commutative and associative $R$-magma
scheme with two-sided unit, representing the functor

\[
G(B) = \{ y \in B \mid y^2 = ay \}
\]

for $R$-algebras $B$, with law of composition $*$ in $G(B)$ defined by

\[
y * z = y + z + byz,
\]

with unit element $y = 0$. The elements $e_1 = ab + 2$ and $e_2 = -ab - 1$ are
orthogonal idempotents in $R$ whose sum is 1, so $S = S_1 \coprod S_2$ is a disjoint
union of open affine subschemes such that $ab = -2$ on $S_1$ and $ab = -1$ on
$S_2$. Hence we can without loss of generality treat those cases separately.
Check that if $ab = -2$, then $G$ is an $R$-group scheme with $y * y = 0$ for all
$y \in G(B)$, all $B$, but if $ab = -1$, then $G$ is not an $R$-group scheme, but is
a monoid with $y * y = y$, all $y \in G(B)$, all $B$.
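
The two cases can be explored numerically. The sketch below is an added illustration, not part of the text: it takes $B = \mathbf{Z}/n\mathbf{Z}$ (an assumption made for the example, as are the function names) and checks closure, associativity and the value of $y * y$ by brute force, once with $(a, b) = (-2, 1)$, where $ab = -2$, and once with $(a, b) = (1, -1)$, where $ab = -1$.

```python
def points(a, n):
    """G(B) for B = Z/nZ: all y with y^2 = a*y in B."""
    return [y for y in range(n) if (y * y - a * y) % n == 0]

def star(y, z, b, n):
    """Law of composition y * z = y + z + b*y*z in B = Z/nZ."""
    return (y + z + b * y * z) % n

def check(a, b, n):
    """Return (closed, associative, squares) for the magma G(Z/nZ)."""
    pts = points(a, n)
    closed = all(star(y, z, b, n) in pts for y in pts for z in pts)
    associative = all(
        star(star(x, y, b, n), z, b, n) == star(x, star(y, z, b, n), b, n)
        for x in pts for y in pts for z in pts
    )
    squares = {y: star(y, y, b, n) for y in pts}
    return closed, associative, squares

# ab = -2: every element squares to the unit 0, so each is its own inverse.
print(points(-2, 8), check(-2, 1, 8))
# ab = -1: every element is idempotent, so only the unit is invertible.
print(points(1, 6), check(1, -1, 6))
```

In the second case the element $y = 1$ satisfies $1 * z = 1$ for every $z$, so it has no inverse, which is the failure of the group axioms in this small instance.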

For each pair of elements $a, b$ in $R$ such that $ab = -2$, let $G_{a,b}$ denote
the $R$-group scheme just introduced. For example, $G_{-2,1} = (\mu_2)_R$ because
$G_{-2,1}(B) = \{ y \in B \mid (1 + y)^2 = 1 \}$ and $1 + y * z = (1 + y)(1 + z)$. On the
other hand, $G_{1,-2} = (\mathbf{Z}/2\mathbf{Z})_R$ as is easily checked.

Check that

\[
G_{a,b} \cong G_{\alpha,\beta} \iff \exists\, u \in R^* \text{ such that } a = u\alpha \text{ and } \beta = ub.
\]

Thus, if 2 is invertible in $R$, then all $G_{a,b}$’s are isomorphic to the constant
group scheme $(\mathbf{Z}/2\mathbf{Z})_R$. If $R = \mathbf{Z}$ or $\mathbf{Z}_2$, then $G_{a,b} \cong (\mathbf{Z}/2\mathbf{Z})_R$ or $(\mu_2)_R$.
If $R = \mathbf{Z}_2[2^{1/17}]$, then there are exactly 18 types of finite flat $R$-group
schemes of order 2, up to isomorphism. If $R$ is an integral domain of
characteristic 2, then the types of $G_{a,b}$’s are: one $G_{0,0}$, and one $G_{a,0}$ and
one $G_{0,a}$ for each non-zero principal ideal $(a)$ in $R$; in particular, if $R$ is
a field of characteristic 2, there are three types of $R$-groups of order 2,
$(\mathbf{Z}/2\mathbf{Z})_R$, $(\mu_2)_R$, and $(\alpha_2)_R = G_{0,0}$.
If $u, v, w \in R$ and $uvw = -2$, then there are pairings on the functors

\[
G_{u,vw}(B) \times G_{v,uw}(B) \to G_{uv,w}(B)
\]

given by

\[
(y_u, y_v) \mapsto y_u y_v.
\]

If $w = 1$, check that the pairings

\[
G_{u,v} \times G_{v,u} \to (\mu_2)_R \subset (\mathbf{G}_m)_R
\]

identify $G_{v,u}$ with the character group scheme of $G_{u,v}$ (cf. 2.9).


(3.3) Passage to quotient by a group scheme of finite order. Let
$H$ be an $S$-group scheme and $X$ a scheme over $S$. A right action of $H$ on
$X$ is a morphism $a : X \times_S H \to X$ such that, for every $S$-scheme $T$ the
induced map $X(T) \times H(T) \to X(T)$ is a right action of the group $H(T)$
on the set $X(T)$, i.e., satisfies the rules $x(h_1 h_2) = (x h_1) h_2$ and $x \cdot 1 = x$.
We will say such an action is strictly free if the morphism

\[
(\mathrm{id}, a) : X \times_S H \to X \times_S X,
\]

i.e., the morphism inducing $(x, h) \mapsto (x, xh)$ on the functors, is not only
injective on the functors, but is a closed immersion. Given a right action
of $H$ on $X$ we will say that a morphism $f : X \to Y$ is constant on orbits if

\[
f \circ a = f \circ \mathrm{pr}_1 : X \times_S H \to Y,
\]

that is, if $f(xh) = f(x)$, all $x \in X(T)$, $h \in H(T)$, all $T$.


3.4 Theorem (Grothendieck). Suppose $H$ finite flat over $S$ locally
noetherian acts strictly freely on $X$ of finite type over $S$ in such a way that
every orbit is contained in an affine open set. Then the category of
morphisms $X \to Z$ which are constant on orbits has an initial object; in other
words there exists an $S$-scheme $Y$ and a morphism $u : X \to Y$ constant
on orbits such that for every morphism $v : X \to Z$ which is constant on
orbits there is a unique morphism $f : Y \to Z$ such that $v = f \circ u$. (Of
course the morphism $u : X \to Y$ is then unique up to a unique
isomorphism; we denote it by $u : X \to X/H$ and call it the canonical morphism
from $X$ to the orbit scheme or the quotient of $X$ by $H$.) The morphism
$u : X \to Y = X/H$ has the following further properties:
(i) $X$ is finite flat over $X/H$ and $[X : (X/H)] = [H : S]$.
(ii) For every $S$-scheme $T$ the map $X(T)/H(T) \to (X/H)(T)$ is
injective.
(iii) If $S = \mathrm{Spec}(R)$, $H = \mathrm{Spec}(B)$ and $X = \mathrm{Spec}(A)$ are affine, then
$X/H = \mathrm{Spec}(A_0)$, where $A_0$ is the subring of $A$ where the two
homomorphisms $\mathrm{pr}_1, a : A \to A \otimes_R B$ coincide.


Remarks. This theorem is a special case of results of Grothendieck ([Gro],
[SGA3, I, Exp.V]). As he realized, the theorem has really nothing to do with
group schemes; one can replace the morphism $(\mathrm{pr}_1, a) : X \times_S H \to X \times_S X$
by any closed subscheme $\mathcal{R}$ of the product $X \times_S X$ which is the graph of an
equivalence relation such that $\mathrm{pr}_1 : \mathcal{R} \to X$ is finite and flat. As suggested
by (iii) it is easy to construct the scheme which will be $X/H$. The difficulty
is to show that it has the characterizing property and satisfies (i) and (ii)
as well. For a concise proof we refer the reader to Raynaud’s short article
[R1], which we recommend also as an excellent introduction to the general
problem of passage to quotients. For more general results we refer the
reader to [K-M].
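
As a simple illustration of (iii) (an example added here, not taken from the text), let $S = \mathrm{Spec}(R)$, $X = (\mathbf{G}_m)_R = \mathrm{Spec}(R[t, t^{-1}])$ and $H = (\mu_n)_R = \mathrm{Spec}(R[s]/(s^n - 1))$ acting by multiplication, so that $a(t) = t \otimes s$ and $\mathrm{pr}_1(t) = t \otimes 1$. For $f = \sum_i c_i t^i$ one finds

\[
a(f) - \mathrm{pr}_1(f) = \sum_i c_i\, t^i \otimes (s^i - 1),
\]

and since the elements $t^i \otimes s^j$ with $i \in \mathbf{Z}$, $0 \le j < n$, form an $R$-basis of $A \otimes_R B$, this vanishes exactly when $c_i = 0$ for all $i$ not divisible by $n$. Hence $A_0 = R[t^n, t^{-n}]$, so $X/H \cong (\mathbf{G}_m)_R$ with $u$ the $n$-th power map, and $A$ is free over $A_0$ with basis $1, t, \ldots, t^{n-1}$, of rank $n = [H : S]$ as (i) requires.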


(3.5) Quotient group scheme, coset spaces. For us, the main
application of 3.4 is the case in which $X = G$ is an $S$-group scheme, $H \subset G$
is a finite flat closed subgroup scheme, and the action $a : G \times_S H \to G$
is the restriction of the group law $m : G \times_S G \to G$. We call $G/H$ the
scheme of left cosets of $H$ in $G$. If $G/H$ is finite and flat over $S$ we call its
order, $[(G/H) : S]$, the index of $H$ in $G$ and denote it by $[G : H]$. Suppose
$[H : S] = m$ is constant. Then $m > 0$ because $H$ has a unit section. By
part (i) of (3.4) we conclude that $[G : (G/H)] = m$ and then by (3.1) that
$G$ is finite flat over $S$ if and only if $G/H$ is, in which case,

\[
[G : S] = [G : (G/H)]\,[(G/H) : S] = [H : S]\,[G : H],
\]

that is, “order of group = order of subgroup × index of subgroup.”

The definition of a left action of a group scheme $G$ on a scheme $X$ is
clear, and it is easy to see that $G$ acts naturally on the left of the scheme
$G/H$ in a unique way such that the diagram

\[
\begin{array}{ccc}
G \times_S G & \xrightarrow{\ \mathrm{id} \times u\ } & G \times_S (G/H) \\
{\scriptstyle m}\downarrow & & \downarrow{\scriptstyle \text{the action of } G} \\
G & \xrightarrow{\quad u \quad} & G/H
\end{array}
\]

commutes. If $H$ is normal in $G$, i.e., if $H$ acts trivially on $G/H$, then
we get a morphism $G/H \times G/H \to G/H$ which makes $G/H$ an $S$-group
scheme and $u : G \to G/H$ an $S$-group homomorphism. The sequence
$0 \to H \to G \xrightarrow{u} G/H \to 0$ is exact in the sense of (1.7) and (1.8) and
also in the sense that $u$ is faithfully flat and $H = \mathrm{Ker}\, u$. Perhaps a simpler
approach to these matters, one we have been avoiding, perhaps wrongly,
is that advocated by Raynaud [R1] of identifying a group scheme $G$ with
the sheaf for the fppf (faithfully flat finite presentation) topology which it
represents, and using Grothendieck’s theory of faithfully flat descent [Gro].
Then the quotient group $G/H$ represents the quotient sheaf, and the exact
sequence in question is simply an exact sequence of sheaves of groups.


(3.6) The fundamental group $\pi_1(S, \alpha)$ and finite étale $S$-group
schemes. A morphism $Y \to S$ is finite étale if it is finite flat and
unramified in the sense that for each point $s \in S$ the fiber $Y_s := Y \times_S \{s\}$
is the spectrum of a separable algebra over the residue field $\kappa(s)$ of $s$, that
is, $Y_s$ is reduced, and for each point $y \in Y_s$ the corresponding residue field
extension $\kappa(y)/\kappa(s)$ is separable; in other words, the inequalities in the
following display are equalities:

\[
\begin{aligned}
(*)\qquad [Y : S](s) = [Y_s : \{s\}] &\ge [(Y_s)_{\mathrm{red}} : \{s\}] \\
&= \sum_{y \in Y_s} [\kappa(y) : \kappa(s)] \\
&\ge \sum_{y \in Y_s} [\kappa(y) : \kappa(s)]_{\mathrm{sep}}.
\end{aligned}
\]

Let $\alpha : \mathrm{Spec}(\Omega) \to S$ be a geometric point of $S$ centered at $s$, that is, an
embedding $\tilde{\alpha} : \kappa(s) \hookrightarrow \Omega$ of $\kappa(s)$ into an algebraically closed field $\Omega$. The
set $Y(\alpha)$ of geometric points of $Y$ mapping to $\alpha$ has cardinality

\[
\sum_{y \in Y_s} [\kappa(y) : \kappa(s)]_{\mathrm{sep}}.
\]

From (*) we conclude that for a finite flat $S$-scheme $Y$ the inequality
$[Y : S](s) \ge \#Y(\alpha)$ holds for a geometric point $\alpha$ of $S$ centered at $s$, and
that $Y$ is étale over $S$ if and only if that inequality is an equality for all
geometric points $\alpha$ of $S$.
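
A standard example (added here for illustration, not taken from the text) is $Y = (\mu_n)_S$, which has order $n$ over $S$. If $\alpha$ is centered at a point $s$ whose residue field has characteristic $p > 0$, write $n = p^e n'$ with $p \nmid n'$. Then over $\Omega$

\[
t^n - 1 = (t^{n'} - 1)^{p^e},
\]

and $t^{n'} - 1$ has $n'$ distinct roots, so $\#Y(\alpha) = n' \le n = [Y : S](s)$, with equality exactly when $e = 0$. In residue characteristic 0 the polynomial $t^n - 1$ is always separable. Thus, by the criterion above, $(\mu_n)_S$ is étale over $S$ if and only if $n$ is invertible on $S$.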

Let (FEt/S) denote the category of finite étale $S$-schemes. Here is a
quick review of the description of (FEt/S) in terms of the fundamental
group of $S$. A convenient reference is [M]; see also [SGA1] and [Mu]. For
simplicity we assume $S$ non-empty and connected. Let $\alpha$ be a geometric
point of $S$. The fundamental group $\pi = \pi_1(S, \alpha)$ of $S$ at the geometric point
$\alpha$ can be defined as the group of automorphisms of the functor $Y \mapsto Y(\alpha)$
from (FEt/S) to (Sets). An element $\sigma \in \pi$ is a collection of
permutations $\sigma_Y$ of the sets $Y(\alpha)$, one for each $Y \in$ (FEt/S), such that for every
(FEt/S)-morphism $Y \to Y'$, the induced map $Y(\alpha) \to Y'(\alpha)$ commutes
with the $\sigma_Y$’s. Then $\pi$ is a profinite group, that is, a compact
Hausdorff topological group in which the open subgroups (those which contain
$\mathrm{Ker}(\pi \to \mathrm{Perm}(Y(\alpha)))$ for some $Y$) form a fundamental system of
neighborhoods of 1.

Let (F$\pi$-sets) denote the category of finite sets $X$ with a continuous
action of $\pi$ on them. By construction, each $Y(\alpha)$ is an object in (F$\pi$-sets).
Grothendieck’s theorem is that the functor $Y \mapsto Y(\alpha)$ from (FEt/S) to
(F$\pi$-sets) is an equivalence of categories. This functor commutes with
cartesian products and disjoint sums; in particular, expressing $Y$ as disjoint
union of its connected components corresponds to expressing $Y(\alpha)$ as a
union of orbits for the action of $\pi$.

The fundamental group $\pi_1(S, \alpha)$ is a functor of geometrically pointed
connected noetherian schemes $(S, \alpha)$. A morphism $f : T \to S$ induces
a homomorphism $f_* : \pi_1(T, \beta) \to \pi_1(S, f(\beta))$ in a natural way so that
the base change functor $Y \mapsto Y_T = Y \times_S T$ from (FEt/S) to (FEt/T)
corresponds under the equivalence of categories to the process of viewing
a $\pi_1(S, f(\beta))$-set as a $\pi_1(T, \beta)$-set via the homomorphism $f_*$.

The fundamental group $\pi_1(S, \alpha)$ is determined up to an inner
automorphism by the scheme $S$. If $\alpha'$ is another geometric point of $S$, the functors
$Y \mapsto Y(\alpha)$ and $Y \mapsto Y(\alpha')$ are isomorphic; an isomorphism between them
is the analog of a homotopy class of paths from $\alpha$ to $\alpha'$, and induces an
isomorphism $\pi_1(S, \alpha) \to \pi_1(S, \alpha')$.

If $k = \kappa(s)$, and $\alpha$ is given by the embedding $\tilde{\alpha} : k \hookrightarrow \Omega$, the group
$\mathrm{Aut}_k(\Omega)$ acts on the left of $\Omega$, so on the right of $\mathrm{Spec}(\Omega)$, so, for each $Y$ in
(FEt/S), on the left of $Y(\alpha) = \mathrm{Hom}_{(\mathrm{Sch}/S)}(\mathrm{Spec}\,\Omega, Y)$. This action gives a
homomorphism of $\mathrm{Aut}_k(\Omega)$ into $\pi_1(S, \alpha)$ which factors through the quotient
$\mathrm{Gal}(k_s/k)$ of $\mathrm{Aut}_k(\Omega)$, where $k_s$ is the separable algebraic closure of $k$ in
$\Omega$, and thereby induces a natural homomorphism $\mathrm{Gal}(k_s/k) \to \pi_1(S, \alpha)$.
It is a nice exercise in Galois theory to show that if $S = \{s\} = \mathrm{Spec}(k)$,
this homomorphism is an isomorphism, and the equivalence of categories
above does hold. The reverse equivalence in this case is given by $X \mapsto
\mathrm{Spec}(\mathrm{Map}_\pi(X, k_s))$, where for a finite set $X$ with a continuous action of
$\pi = \mathrm{Gal}(k_s/k)$ on it we denote by $\mathrm{Map}_\pi(X, k_s)$ the $k$-algebra of maps
$X \to k_s$ commuting with $\pi$.

Getting back to our business of group schemes, the upshot of all this
is that the category of finite étale group schemes over a noetherian base
scheme $S$ has a simple description. If $\alpha$ is a geometric point of $S$, the
functor $G \mapsto G(\alpha)$ is an equivalence of that category with the category of
finite groups with a continuous operation of $\pi_1(S, \alpha)$.

Let $G$ be a finite flat $S$-group scheme. Then $G/S$ is étale if and only if
the sheaf of relative differentials $\Omega^1_{G/S}$ is zero (cf. e.g., [M], Ch.1, Prop.3.5).
Hence, by (2.11), $G/S$ is étale if and only if $\mathcal{I} = \mathcal{I}^2$, where $\mathcal{I} \subset \mathcal{O}_G$ is the
augmentation ideal sheaf. Equivalently, $G/S$ is étale if and only if the unit
section $\varepsilon(S) = \mathrm{Spec}(\mathcal{O}_G/\mathcal{I})$ is open (and closed) in $G$. This is true, because
if $\mathcal{I} = \mathcal{I}^2$, then $\mathcal{I}_x = (0)$ for $x \in \mathrm{Spec}(\mathcal{O}/\mathcal{I})$, by Nakayama’s Lemma, hence
the complement of $\mathrm{Spec}(\mathcal{O}/\mathcal{I})$ is the support of $\mathcal{I}$ and is closed.

We will soon see that every finite flat S-group G whose order [G : S] is 
invertible on S is étale. 


(3.7) The connected-étale exact sequence over a Henselian local 
ring. In this section we assume S = Spec(R) is the spectrum of a henselian 
local ring R, for example, a field or a complete discrete valuation ring. For 
some basic properties of hensel rings which we use here see for example 
[M,1,§4]. Let M be the maximal ideal of R, k = R/M the residue field, 
and s = Spec(k) the closed point of S. 

Our aim in this section is to prove the following four things about a 
finite flat S-group scheme G. 


(I). Let $G^0$ be the connected component of the identity in $G$. Then $G^0$ is
the spectrum of a henselian local $R$-algebra with the same residue field as
$R$ and is a flat closed normal subgroup scheme of $G$ such that the quotient
in the sense of (3.4), $G^{\mathrm{et}} := G/G^0$, is étale. We call the exact sequence

\[
0 \to G^0 \to G \to G^{\mathrm{et}} \to 0
\]

the connected-étale sequence for $G$. It can be characterized by the fact that
every homomorphism from $G$ to an étale $S$-group scheme factors through
$G \to G^{\mathrm{et}}$, and $G^0$ is the kernel of that homomorphism.


(II). If the residue characteristic of $R$ is 0, then $G^0 = S$ and $G = G^{\mathrm{et}}$.
If it is $p > 0$, then the order $[G^0 : S]$ of $G^0$ is a power of $p$. (It follows
immediately from this that if $[G : S]$ is invertible in $S$, then $[G^0 : S] = 1$ and
$G = G^{\mathrm{et}}$ is étale over $S$. The same is true over an arbitrary base scheme,
by passage to the henselizations (or localizations) of its local rings.)


(III). If $R = k$ is a field, and $n = [G : S]$, then $G$ is killed by $n$, that is,
$x^n = 1$ for $x \in G(B)$, for every $k$-algebra $B$. (In the next section, we will
give Deligne’s proof that a commutative finite locally free group scheme
over any base is killed by its order.)

(IV). If $R$ is a perfect field, the homomorphism $G \to G^{\mathrm{et}}$ has a section
and $G$ is a semidirect product, $G = G^0 \rtimes G^{\mathrm{et}}$.


Suppose $T$ is a scheme finite over $S$. Then $T = \mathrm{Spec}(A)$ with $A$ a finite
$R$-algebra. Since $R$ is henselian we have $A = \prod_i A_i$, a finite product with each $A_i$ a local
hensel ring ([M], loc.cit.). Accordingly,

\[
T = \coprod_i T_i
\]

is a finite disjoint union of open subschemes $T_i = \mathrm{Spec}(A_i)$, each of which
is the spectrum of a local hensel $R$-algebra. In particular, the $T_i$ are
connected; they are the connected components of $T$. For each $i$, let $t_i$ be the
closed point of $T_i$ and $k_i = \kappa(t_i)$ its residue field.

Let $\alpha$ be the geometric point of $S$ corresponding to an algebraic closure $\bar{k}$
of $k = \kappa(s)$. Let $\pi = \mathrm{Aut}_k(\bar{k}) = \mathrm{Gal}(k_s/k)$. Then $\pi$ acts on

\[
T(\alpha) = \mathrm{Hom}_{R\text{-alg}}(A, \bar{k}) = \coprod_i \mathrm{Hom}_{k\text{-alg}}(k_i, \bar{k})
\]

through its action on $\bar{k}$. The functor $T \mapsto T(\alpha)$ from finite $S$-schemes to
finite $\pi$-sets commutes with products and disjoint unions. From this several
things are obvious:

1) The $T_i(\alpha)$ are the orbits for the action of $\pi$ on $T(\alpha)$; in particular,
$T$ is connected if and only if $\pi$ acts transitively on $T(\alpha)$.

2) $T_i \times_S T_j$ is connected $\Leftarrow$ either $T_i(\alpha)$ or $T_j(\alpha)$ is a singleton $\Leftrightarrow$
either $k_i$ or $k_j$ is pure inseparable over $k$.

3) The connected components of the closed fiber $T_s = T \times_S \{s\}$ are
the closed fibers $(T_i)_s$ of the connected components of $T$.


Suppose now $G$ is a finite $S$-group scheme. Let $G^0$ be the connected
component of $G$ which contains the image of the identity section $\varepsilon : S \to G$.
Then $S$ is a closed subscheme of the local scheme $G^0$ so they have the same
residue field, $k$. From 2) above it follows that for each connected component
$G_i$ of $G$ the product $G_i \times_S G^0$ is connected. Its image $G_i G^0$ under the law
of composition $m : G \times_S G \to G$ is connected and contains $G_i S = G_i$ so
is equal to $G_i$. In particular $G^0 G^0 = G^0$. Also, the inverse morphism inv
preserves $G^0$ because it is an automorphism of the scheme $G$ preserving
$\varepsilon$. Hence $G^0$ is an open and closed subgroup scheme of $G$. To show it is
normal in $G$ it suffices to show that the map

\[
G \times_S G^0 \to G, \qquad (g, g^0) \mapsto g g^0 g^{-1},
\]

has image in $G^0$. This is true because $G \times_S G^0 = \coprod_i G_i \times_S G^0$, and for
each $i$ the image of $G_i \times_S G^0$ is connected and contains the unit section.

Suppose now $G$ is flat as well as finite over $S$. Then each connected
component of $G$ is flat, so $G^0$ is a flat normal subgroup scheme of $G$ and
we can form the quotient $S$-group scheme $G^{\mathrm{et}} := G/G^0$ as in (3.4), (3.5).
As remarked in 3.5, the fact that $G$ is flat implies that $G^{\mathrm{et}}$ is flat and
$[G : S] = [G^{\mathrm{et}} : S][G^0 : S]$. Since $G^0$ is open in $G$, the unit section
$G^0/G^0 = S$ is open in $G/G^0 = G^{\mathrm{et}}$, and this implies $G^{\mathrm{et}}$ is étale, as
remarked at the end of (3.6).

To finish the proof of (I), note that there is no non-trivial homomorphism
of a connected $S$-group scheme to an étale one, because such a
homomorphism would factor through the identity component of the étale one, which
is the unit section $S$. Thus a homomorphism of $G$ into an étale $S$-group $H$
has $G^0$ in its kernel, so factors through $G/G^0$.

Over a hensel local base the functor $Y \mapsto Y_s$ is an equivalence between
(FEt/S) and (FEt/$\{s\}$); equivalently, the homomorphism

\[
\pi = \pi_1(\{s\}, \alpha) \to \pi_1(S, \alpha)
\]

induced by the inclusion $\{s\} \hookrightarrow S$ is an isomorphism. Therefore a finite
étale $S$-group scheme $H$ is determined by the $\pi$-group $H(\alpha)$ which can be
an arbitrary finite group on which $\pi$ acts continuously.

The $\pi$-group corresponding to $G^{\mathrm{et}}$ is $G(\alpha)$. Indeed, the homomorphism
$G(\alpha) \to G^{\mathrm{et}}(\alpha)$ is surjective because $G$ is finite over $G^{\mathrm{et}}$, and is injective
because its kernel $G^0(\alpha)$ has only one element.


To prove (II) we can assume $G = G^0$ is connected and $R =
k$, a field. Then $G = \mathrm{Spec}(A)$, with $A$ a finite dimensional local $k$-algebra.
The maximal ideal of $A$ is the augmentation ideal $I$ and is nilpotent. Let
$\{x_i\}$, $1 \le i \le r = \dim_k(I/I^2)$, be a family of elements of $I$ whose residues $\bar{x}_i$
form a basis for the $k$-vector space $I/I^2$. By (2.11), there exist right
invariant derivations $D_i : A \to A$, $1 \le i \le r$, such that $D_i x_j \equiv \delta_{ij} \pmod{I}$. By
the product rule we have $D_i I^\nu \subset I^{\nu - 1}$, and the $D_i$’s induce derivations $\bar{D}_i$
of degree $-1$ on the graded ring $\mathrm{Gr}_I(A) = \bigoplus_{\nu \ge 0} I^\nu/I^{\nu+1} = k[\bar{x}_1, \ldots, \bar{x}_r]$.
Let $X_i$, $1 \le i \le r$, be independent variables, $k[X] = k[X_1, \ldots, X_r]$ and
$\varphi : k[X] \to \mathrm{Gr}_I(A)$ the $k$-algebra homomorphism given by $\varphi(X_i) = \bar{x}_i$.


Lemma 3.7.1. If char $k = 0$, then $\varphi$ is an isomorphism; if char $k = p > 0$,
then $\varphi$ induces an isomorphism $k[X]/(X_1^p, \ldots, X_r^p) \xrightarrow{\sim} \mathrm{Gr}_I(A)/(\bar{x}_1^p, \ldots, \bar{x}_r^p)$.


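Proof. We have $\bar{D}_i \varphi = \varphi\, \partial/\partial X_i$ because these two $k$-linear derivations
coincide on the generators $X_j$ of $k[X]$. Let $J = \mathrm{Ker}\, \varphi$ if char $k = 0$ and
$J = \varphi^{-1}(\bar{x}_1^p, \ldots, \bar{x}_r^p)$ if char $k = p > 0$. Then $J$ is a homogeneous
ideal in $k[X]$, stable by $\partial/\partial X_i$ for each $i$, not equal to $k[X]$, and
containing $(X_1^p, \ldots, X_r^p)$ if char $k = p > 0$. Let $P = \sum c_{\nu_1, \ldots, \nu_r} X_1^{\nu_1} \cdots X_r^{\nu_r} \in J$.
Then

\[
\nu_1!\, \nu_2! \cdots \nu_r!\; c_{\nu_1, \ldots, \nu_r}
= \left( \left(\frac{\partial}{\partial X_1}\right)^{\nu_1} \cdots \left(\frac{\partial}{\partial X_r}\right)^{\nu_r} P \right)(0, 0, \ldots, 0)
\]

is in $J$ because $J$ is homogeneous. Since $1 \notin J$ it follows that $c_{\nu_1, \ldots, \nu_r} = 0$
for all $(\nu_1, \ldots, \nu_r)$ if char $k = 0$, and for $(\nu_1, \ldots, \nu_r)$ such that $\nu_i < p$ for
each $i$, if char $k = p > 0$. Thus $J = (0)$ or $J = (X_1^p, \ldots, X_r^p)$ in the two
cases, as claimed.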


Lemma 3.7.1 proves (II) if char $k = 0$ because $k[X]$ is finite dimensional
only if $r = 0$, $k[X] = k$. If char $k = p > 0$ we use induction on the order of
$G$ and


Lemma 3.7.2. Suppose char $k = p > 0$. Let $B = A/(x_1^p, \ldots, x_r^p)$. Then
$H = \mathrm{Spec}(B)$ is a finite flat normal subgroup scheme of $G$ of order $p^r$.


Proof. The closed subscheme $H \subset G$ is flat because $k$ is a field. It is a
normal subgroup scheme because it is the kernel of the Frobenius
homomorphism $F : G \to G^{(p)}$. Recall that for a scheme $X$ over $k$, $X^{(p)}$ denotes
the base change of $X$ from $k$ to $k$ corresponding to the homomorphism
$x \mapsto x^p$ of $k$ into itself. For a $k$-algebra $B$ we have $X^{(p)}(B) = X(B')$, where
$B'$ denotes the ring $B$, viewed as $k$-algebra with elements $c \in k$ acting on
$B'$ via the $p$-th power of their action on $B$. The map $F' : B \to B'$ defined
by $F'(b) = b^p$ is then a $k$-algebra homomorphism, and the corresponding
homomorphism of functors

\[
F_* : X(B) \to X(B') = X^{(p)}(B)
\]

induces a morphism of $k$-schemes which is given by raising the coordinates
of a point to the $p$-th power and which we denote by $F : G \to G^{(p)}$.

If $X = \mathrm{Spec}(k[x_1, \ldots, x_r])$ is an affine $k$-group scheme with
augmentation ideal $I$ generated by the coordinate functions $x_i$, as is the case with our
$G$, then $F$ is a group homomorphism (because $F_*$ above is), and $\mathrm{Ker}\, F$ is
represented by the closed finite flat subscheme $\mathrm{Spec}(k[x_1, \ldots, x_r]/(x_1^p, \ldots,
x_r^p))$, which is therefore a normal subgroup scheme. For more on Frobenius
maps see [SGA3, Exp. VII_A, 4].

Statement (II) in case char $k = p > 0$ now follows. If $r = 0$, i.e., $[G :
k] = 1$ there is nothing to prove. Otherwise, in the notation of Lemma 3.7.2,
$[H : k] = p^r > 1$ and $[G : k] = p^r [(G/H) : k]$. By induction, $[(G/H) : k]$ is
a power of $p$, so the same holds for $G$.
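
For a concrete instance of this induction (an illustration added here, in the notation of Lemma 3.7.2), let $G = (\mu_{p^2})_k$ with char $k = p$. Then $A = k[t]/(t^{p^2} - 1) = k[x]/(x^{p^2})$ with $x = t - 1$, so $G$ is connected, $I = (x)$ and $r = \dim_k(I/I^2) = 1$. The subgroup of Lemma 3.7.2 is

\[
H = \mathrm{Spec}(k[x]/(x^p)) = \mathrm{Spec}(k[t]/(t^p - 1)) = (\mu_p)_k,
\]

the kernel of the Frobenius map $t \mapsto t^p$, of order $p^r = p$. The quotient $G/H$ is again isomorphic to $(\mu_p)_k$, which is connected with $r = 1$, and the induction gives $[G : k] = p \cdot p = p^2$.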


Notation. If $G = \mathrm{Spec}(A)$ is a group scheme and $m$ an integer, we let
$[m] : A \to A$ denote the homomorphism corresponding to raising to the
$m$-th power in $G$. Thus, $([m](f))(x) = f(x^m)$.


Lemma 3.7.3. Suppose the ground ring $R$ satisfies $pR = 0$ for some prime
$p$. Let $G = \mathrm{Spec}(A)$ be a finite free $R$-group scheme, or more generally, a
closed $R$-subgroup scheme of $(GL_n)_R$ for some $n$, with augmentation ideal
$I$. Then $[p]I \subset I^p$.


Remark. I learned this lemma and its very simple proof from a preprint of
F. Andreatta and R. Schoof in which they use it to prove that a finite flat
group scheme over the ring of dual numbers $k[\varepsilon]$ ($k$ a field, $\varepsilon^2 = 0$) is killed
by its order. They tell me that they learned it from Bas Edixhoven.


Proof. Let $U = (u_{ij})$ be an $n \times n$ matrix with independent indeterminate
entries $u_{ij}$. Then $(GL_n)_R = \mathrm{Spec}(B)$ where $B = R[u_{ij}, 1/\det(U)]$. If
$G = \mathrm{Spec}(A)$ is a finite flat $R$-group scheme such that $A$ is a free
$R$-module of rank $n$, then the action of $G$ on itself by translations gives an
imbedding of $G$ as a closed subgroup scheme of $(GL_n)_R$, the “regular
representation” of $G$. If $(f_i)$, $1 \le i \le n$, is a basis for $A$ over $R$, such an
imbedding is given by the homomorphism of $R$-algebras $\varphi : B \to A$ such
that $\varphi(u_{ij}) = a_{ij}$, where the $a_{ij}$ are defined by $m(f_j) = \sum_{i=1}^n f_i \otimes a_{ij}$. (This
is a representation using right translations; if $y \in G(R')$, $R'$ an $R$-algebra,
then the automorphism $\tau_y$ of $A \otimes_R R' = \sum f_i \otimes R'$ discussed in (2.11) is
given by $\tau_y(f_j) = \sum f_i a_{ij}(y)$.) The homomorphism $\varphi$ is surjective because
$f_j = \sum \varepsilon(f_i) a_{ij}$ for each $j$.

Suppose more generally that $G = \mathrm{Spec}(A)$ is any closed subgroup scheme
of $(GL_n)_R$ and $\varphi : B \to A$ the corresponding homomorphism. Let $J$
be the augmentation ideal in $B$, generated by the entries of the matrix
$U - I_n = (u_{ij} - \delta_{ij}) = (v_{ij})$, say. We have $U^p = ([p](u_{ij}))$. Therefore
$([p](v_{ij})) = ([p](u_{ij})) - (\delta_{ij}) = U^p - I_n = (U - I_n)^p = (v_{ij})^p$, which shows
that $[p]J \subset J^p$. Since $J$ is the inverse image of $I$ under the surjective map
$\varphi : B \to A$, $I$ is the image of $J$ in $A$ and the lemma follows.
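
The matrix identity $U^p - I_n = (U - I_n)^p$ used above holds because $U$ and $I_n$ commute and $p$ divides the middle binomial coefficients. The sketch below is an added illustration, not part of the proof: it tests the identity on matrices with entries in $\mathbf{Z}/p\mathbf{Z}$, that is, on specializations of the indeterminates $u_{ij}$. The function names and the sample matrix are assumptions made for the example.

```python
def mat_mul(x, y, modulus):
    """Product of two square matrices with entries reduced mod `modulus`."""
    n = len(x)
    return [
        [sum(x[i][k] * y[k][j] for k in range(n)) % modulus for j in range(n)]
        for i in range(n)
    ]

def mat_pow(x, exponent, modulus):
    """x raised to a non-negative integer power, by repeated multiplication."""
    n = len(x)
    result = [[1 % modulus if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(exponent):
        result = mat_mul(result, x, modulus)
    return result

def frobenius_identity_holds(u, p):
    """Check (U - I)^p == U^p - I entrywise modulo p."""
    n = len(u)
    v = [[(u[i][j] - (1 if i == j else 0)) % p for j in range(n)] for i in range(n)]
    lhs = mat_pow(v, p, p)
    u_to_p = mat_pow(u, p, p)
    rhs = [[(u_to_p[i][j] - (1 if i == j else 0)) % p for j in range(n)] for i in range(n)]
    return lhs == rhs

sample = [[1, 2, 0], [3, 4, 1], [0, 5, 6]]
print([frobenius_identity_holds(sample, p) for p in (2, 3, 5, 7)])
# With a non-prime modulus the identity can fail, e.g. for the 1x1 matrix (2) mod 4.
print(frobenius_identity_holds([[2]], 4))
```

The last line shows why the hypothesis $pR = 0$ with $p$ prime matters: modulo 4 the binomial coefficients no longer vanish.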

We can now prove (III). If $H = G/N$ and $H$, $G$ and $N$ are finite flat
$R$-group schemes, and $H$ and $N$ are killed by their orders, then so is $G$. The
equivalence of categories discussed at the end of (3.6) shows that a finite
étale group scheme is killed by its order. Hence to show that $G$ is killed
by its order it suffices by (I) to show that its connected component $G^0$ is
killed by its order. Suppose therefore $G$ is connected and $R = k$, a field.
By (II) we can suppose the characteristic of $k$ is $p > 0$, and the order of
$G$ is $q = p^m$ for some $m$. Then Lemma 3.7.3 applies, and in the notation of
that lemma we have $[p](I) \subset I^p$. Iterating $m$ times gives $[q](I) \subset I^q$. But
in an Artin local ring of length $q$ with maximal ideal $I$ one has $I^q = (0)$.
Hence $[q](I) = (0)$. This means that $[q](f) = f(1) = [0](f)$ as claimed
in (III).

To prove (IV) we assume $k$ is a field of characteristic $p > 0$ and $G =
\mathrm{Spec}(A)$ is a finite $k$-group scheme. Let $N$ be the nilradical of $A$, so $G_{\mathrm{red}} =
\mathrm{Spec}(A/N)$. Suppose $G_{\mathrm{red}}$ is étale over $k$, which is automatic if $k$ is perfect.
Then $G_{\mathrm{red}} \times G_{\mathrm{red}}$ is reduced so that the map

\[
G_{\mathrm{red}} \times G_{\mathrm{red}} \to G \times G \to G
\]

factors through $G_{\mathrm{red}}$ and induces a $k$-group scheme structure on $G_{\mathrm{red}}$. Let
$\alpha = \mathrm{Spec}(\bar{k})$ as in the beginning of 3.7. The isomorphisms $G_{\mathrm{red}}(\alpha) =
G(\alpha) = G^{\mathrm{et}}(\alpha)$ show that the restriction to $G_{\mathrm{red}}$ of the map $G \to G^{\mathrm{et}}$ is an
isomorphism. Hence $G \cong G^0 \rtimes G_{\mathrm{red}}$ is a semidirect product, as claimed.


(3.8) The dual Hopf algebra and Cartier duality. We denote the
dual of an $R$-module $M$ by $M' := \mathrm{Hom}_{(R\text{-mod})}(M, R)$, and sometimes write
$\langle m', m \rangle = m'(m)$ for $m \in M$ and $m' \in M'$. If $M$ and $N$ are locally
free of finite rank, then the natural homomorphisms $M \to (M')'$ and
$M' \otimes N' \to (M \otimes N)'$ are isomorphisms. Thus $M \mapsto M'$ is an
anti-equivalence of the category of all such modules with itself, which commutes
with tensor products. It follows easily that if $G = \mathrm{Spec}(A)$ is a finite flat
$R$-group scheme, then the dual $A'$ of the commutative Hopf algebra $A$ is a
cocommutative Hopf algebra in which the multiplication $A' \otimes A' \to A'$ is
dual to the comultiplication $m : A \to A \otimes A$ and vice versa, the unit element
in $A'$ is the counit $\varepsilon : A \to R$ in $A$ and vice versa, and the antipodes are
dual to each other. We leave the details to the reader with the following
remarks. That the dual of an algebra is a coalgebra and vice versa is easily
checked. The requirement that the comultiplication and counit should be
unitary algebra homomorphisms can be expressed as the commutativity of
the following diagram.


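\[
\begin{array}{ccccc}
R \otimes R & \xleftarrow{\ \varepsilon \otimes \varepsilon\ } & A \otimes A & \xrightarrow{\ m \otimes m\ } & (A \otimes A) \otimes (A \otimes A) \\
 & & & & \downarrow{\scriptstyle \tau_{2,3}} \\
\downarrow & & \downarrow{\scriptstyle \mathrm{mult}} & & (A \otimes A) \otimes (A \otimes A) \\
 & & & & \downarrow{\scriptstyle \mathrm{mult} \otimes \mathrm{mult}} \\
R & \xleftarrow{\quad \varepsilon \quad} & A & \xrightarrow{\quad m \quad} & A \otimes A
\end{array}
\]

(Here $\tau_{2,3}$ is defined by $\tau_{2,3}(x_1 \otimes x_2 \otimes x_3 \otimes x_4) = x_1 \otimes x_3 \otimes x_2 \otimes x_4$.) The
symmetry of the diagram upon reversing arrows shows that the condition
holds in $A'$ since it holds in $A$.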

If $G = H_R$ is the constant $R$-group scheme associated with a finite group
$H$, then $A$ is the ring of $R$-valued functions on $H$, and $A' = R[H]$ is the
group algebra of $H$ over $R$. The pairing is the obvious one:

\[
\Big\langle \sum_{x \in H} r_x x,\ f \Big\rangle = \sum_{x \in H} r_x f(x).
\]

In the general case it is useful to think of $A$ and $A'$ as the analogs,
respectively, of the ring of functions on $G$ and the group algebra of $G$. In fact, an
element $f \in A$ does give a function $f_B$ on $G(B)$ with values in $B$ for each
$R$-algebra $B$, if we put $f_B(x) := x(f)$ for each point $x : A \to B$ in $G(B)$,
and $f$ is determined by these functions, in fact by $f_A$, since $f = f_A(\mathrm{id})$,
where $\mathrm{id} \in G(A) = G(G)$ is the identity map.

Although $A'$ is not the group algebra of $G(R)$, nevertheless the inclusion

$$G(R) = \operatorname{Hom}_{(R\text{-alg})}(A, R) \subset \operatorname{Hom}_{(R\text{-mod})}(A, R) = A'$$

identifies $G(R)$ with the multiplicative group of group-like elements of $A'$,
that is, the group of invertible elements $\lambda$ in $A'$ such that $\lambda$ is mapped
to $\lambda \otimes \lambda$ by the comultiplication in $A'$. This is routine to verify. The
group law in $G(R)$ is given by multiplication in $A'$ because it is dual to the
comultiplication $m$ in $A$. Again by duality, an element $\lambda \in A'$ is group-like
if and only if it is invertible in $A'$ and the map $\lambda: A \to R$ is multiplicative.
(Assuming $\lambda$ is multiplicative, one checks that it is invertible if and
only if $\lambda(1) = 1$.)

The formation of the dual Hopf algebra $A'$ commutes with base exten-
sion; for each R-algebra $B$ we can identify $A' \otimes_R B$ with $(A \otimes_R B)'$, where
the second prime (') is relative to the base $B$. Thus $G(B)$ is the group
of group-like elements in the Hopf algebra $A'_B := A' \otimes_R B$ over $B$, for
each R-algebra $B$. We denote the B-linear pairing $A'_B \times A_B \to B$ by
$\langle\ ,\ \rangle_B$. Then for $f \in A$ or $A_B$ and $x \in G(B) \subset A'_B$ the value of $f$ at $x$ is
$f_B(x) = \langle x, f\rangle_B \in B$. In particular, for $f \in A$ and $\mathrm{id} \in G(A) \subset A' \otimes A$ we
have $f = f_A(\mathrm{id}) = \langle \mathrm{id}, f \otimes 1\rangle_A$.

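Let $\lambda \in G(R) \subset A'$, and let $\tau_\lambda : A \to A$ be the transpose of right
multiplication by $\lambda$ in $A'$. For $\mu \in A'$ and $f \in A$ we have

$$\langle \mu, \tau_\lambda(f) \rangle = \langle \mu\lambda, f \rangle,$$

and the same holds after base extension, that is, $(\tau_\lambda(f))_B(x) = f_B(x\lambda)$ for
$x \in G(B) \subset A' \otimes_R B$, all $B$. Thus $\tau_\lambda$ is the automorphism of the R-algebra
$A$ corresponding to right translation by $\lambda$.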


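Proposition 3.8.1. In the group of automorphisms of the left $A'$-module
$A' \otimes_R A$, let $\tau := \mathrm{id}_{A'} \otimes \tau_\lambda$, $\rho :=$ right multiplication by $\mathrm{id}$ and $\ell :=$ right
multiplication by $\lambda \otimes 1$. Then $\tau\rho\tau^{-1}\rho^{-1} = \ell$.


Proof. Taking $\phi = \tau_\lambda$ in the lemma below we find that $\tau(\mathrm{id}) = \ell(\mathrm{id}) =
\mathrm{id} \cdot (\lambda \otimes 1)$. Hence, for $X \in A' \otimes_R A$ we have (since $\tau_\lambda$, hence also $\tau$, is a
ring automorphism) $\tau\rho(X) = \tau(X \cdot \mathrm{id}) = \tau(X) \cdot \tau(\mathrm{id}) = \tau(X) \cdot \mathrm{id} \cdot (\lambda \otimes 1) =
\ell\rho\tau(X)$.


Lemma 3.8.2. Let $\phi: A \to A$ be an R-linear map and let $\phi' : A' \to A'$ be
its transpose. Then $(\mathrm{id}_{A'} \otimes \phi)(\mathrm{id}) = (\phi' \otimes \mathrm{id}_A)(\mathrm{id})$.


Proof. We leave this bit of linear algebra to the reader. In fact, each side
of the stated equality is equal to the element of $A' \otimes A$ which corresponds
to $\phi$ and to $\phi'$ under the canonical isomorphisms $A' \otimes A = \operatorname{End}_R(A) =
\operatorname{End}_R(A')$.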


The left $A'$-module $A' \otimes A$ is free of rank $n := [G : R]$, the order of
$G$, and (3.8.1) shows that the "constant" matrix $\lambda I_n$ is a commutator in
the group $\mathrm{GL}_n(A')$. If $A'$ is commutative, that is, $G$ is commutative, then
we can use the determinant homomorphism $\mathrm{GL}_n(A') \to (A')^*$ to conclude
that $\lambda^n = 1$. The same holds for $\lambda \in G(B) \subset A'_B$ for an arbitrary base ring
extension $R \to B$. Thus a commutative finite flat group scheme is killed
by its order. The above is Deligne's proof of that fact, presented perhaps
in a less comprehensible way than in [O-T].

I do not know whether a non-commutative finite flat group scheme is 
killed by its order. This is true if R = k is a field (cf. (3.7)III), and 
hence if R has no nilpotent elements. As remarked (loc. cit.), Andreatta 
and Schoof have proved it for the ring of dual numbers $R = k[\varepsilon]$, $\varepsilon^2 = 0$.

Suppose now $G$ is commutative. Then $A'$ is commutative, so that $G' :=
\operatorname{Spec}(A')$ makes sense and is a finite flat commutative R-group scheme of
the same order as $G$. The functor $G \mapsto G'$ is an anti-equivalence of the
category of finite flat commutative R-group schemes with itself, such that
$(G')'$ is canonically and functorially isomorphic to $G$. This Cartier duality
is a vast generalization of the classical duality of finite abelian groups. As
explained in (2.9), the group-like elements of $A'_B$ are the characters of $G'$
defined over $B$, and it follows from the above that $G$ is the character group
scheme of $G'$ in the sense of (2.9), that is, represents the functor

$$B \mapsto \operatorname{Hom}_{(\mathrm{Gr}/B)}(G'_B, (\mathbb{G}_m)_B).$$

By symmetry, $G'$ is the character group scheme of $G$. For each R-algebra
$B$, the pairing

$$G(B) \times G'(B) \to \mathbb{G}_m(B) = B^*$$

is given by the symbol $\langle\ ,\ \rangle_B$, if we imbed $G(B)$ and $G'(B)$ in $A'_B$ and
$A_B$ respectively as above. On the other hand, if we view these pairings
as a bimultiplicative invertible function on $G \times_R G' = \operatorname{Spec}(A \otimes_R A')$,
the function is the element $\mathrm{id} \in A \otimes A' = \operatorname{End}_R(A)$ corresponding to the
identity map of $A$, because $\langle \lambda \otimes f, \mathrm{id}\rangle = \langle \lambda, f\rangle$, as one easily checks, and
the same holds after base extension.
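
As a concrete instance of the pairing $G(B) \times G'(B) \to B^*$, one can take $G$ to be the constant group $\mathbb{Z}/n\mathbb{Z}$, whose Cartier dual is $\mu_n$, and evaluate on a field $B$ containing the $n$th roots of 1. The sketch below makes the specific (illustrative) choices $n = 4$ and $B = \mathbb{F}_{13}$; the pairing is then $(a, \zeta) \mapsto \zeta^a$, and the code checks that it is bimultiplicative and non-degenerate. The names `mu_n` and `pairing` are assumptions of the sketch.

```python
p, n = 13, 4   # assumption: B = F_13, which contains the 4th roots of unity

# G(B) = Z/nZ (constant group), G'(B) = mu_n(F_p)
mu_n = [z for z in range(1, p) if pow(z, n, p) == 1]

def pairing(a, zeta):
    """The Cartier pairing Z/nZ x mu_n -> B^*, (a, zeta) -> zeta^a."""
    return pow(zeta, a % n, p)

# Bimultiplicativity in each variable.
bimultiplicative = all(
    pairing((a + b) % n, z) == pairing(a, z) * pairing(b, z) % p
    and pairing(a, z * w % p) == pairing(a, z) * pairing(a, w) % p
    for a in range(n) for b in range(n) for z in mu_n for w in mu_n
)

# Non-degeneracy: no non-trivial element pairs trivially with everything.
nondegenerate = (
    all(any(pairing(a, z) != 1 for z in mu_n) for a in range(1, n))
    and all(any(pairing(a, z) != 1 for a in range(n)) for z in mu_n if z != 1)
)
```

Over a base where $n$ is not invertible, $\mu_n$ is no longer étale and this point-counting picture breaks down, which is exactly why the Hopf algebra formulation is needed.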


An application: Why non-abelian simple groups are étale. Serre
and Raynaud have explained to me why finite flat group schemes which
are very non-commutative tend to be étale. The point is that if $G/S$ is
not étale at a point $s \in S$, then $p = \operatorname{char}(\kappa(s))$ is not 0, and over the
henselization $R^h$ of the local ring $R = \mathcal{O}_{S,s}$ of $s$, the connected component
$G^0$ of $G$ is a normal subgroup scheme of p-power order, the normality of
which works against non-abelianness. For example, suppose $S$ is a normal
scheme with field of fractions $K$ and the general fiber $G_K$ of $G$ is étale
(which is automatic if $\operatorname{char}(K) = 0$). Let $\bar K$ be an algebraic closure of $K$.
Then in the situation just discussed, we will have $R \subset R^h \subset \bar K$, and if $G$ is
not étale at $s$, then, for $p = \operatorname{char}(\kappa(s))$, the finite group $G(\bar K)$ will have a
non-trivial subgroup $G^0(\bar K)$ which is of p-power order by (II) above, and
is normal (and also stable under the action of the decomposition group
$\operatorname{Aut}(\bar K/R^h)$). If $G(\bar K)$ has no such subgroup, then $G$ is étale at $s$.
Thus if $G(\bar K)$ has no non-trivial normal p-subgroup for any prime $p$, for example, if
$G(\bar K)$ is a non-abelian simple group, then $G/S$ is étale.

§4. RAYNAUD'S RESULTS ON COMMUTATIVE p-GROUP SCHEMES
This part is taken entirely from Raynaud's great paper [R2].


(4.1) Prolongations. In this section we assume for simplicity that our
ground ring $R$ is a discrete valuation ring of mixed characteristic. Let $K$
be its field of fractions, $\pi$ a prime element, $k = R/\pi R$ the residue class
field, $p$ the residue characteristic, $v$ the normalized valuation ($v(\pi) = 1$),
and $e = v(p)$ the absolute ramification index.

Let $G_0 = \operatorname{Spec}(A_0)$ be a finite commutative K-group scheme. By a
prolongation of $G_0$ (to $\operatorname{Spec} R$) we mean a finite flat R-group scheme $G$
whose generic fiber is $G_0$. The isomorphism classes of prolongations of $G_0$
are represented by the R-group schemes $G$ of the form $G = \operatorname{Spec}(A)$, for $A$
a finite R-subalgebra of $A_0$, containing $R$ and spanning $A_0$, such that
$c(A) \subset A \otimes_R A$, where $c: A_0 \to A_0 \otimes A_0$ is the comultiplication in $A_0$.
(Exercise: Why is the existence of an inverse automatic in this situation?)

Let $G_0^D = \operatorname{Spec}(A_0^D)$ be the Cartier dual of $G_0$ and let

$$\langle \cdot, \cdot \rangle : A_0 \times A_0^D \to K$$

be the canonical bilinear map. The Cartier dual of a prolongation $G =
\operatorname{Spec}(A)$ is $G^D = \operatorname{Spec}(A^D)$, where $A^D \subset A_0^D$ is the complementary module
to $A$, that is

$$A^D := \{\lambda \in A_0^D : \langle \lambda, f\rangle \in R \text{ for all } f \in A\} \cong \operatorname{Hom}_{R\text{-mod}}(A, R).$$

The multiplication $A_0^D \otimes_K A_0^D \to A_0^D$ is the transpose of the comultiplica-
tion $c$. Hence the condition $c(A) \subset A \otimes_R A$ is equivalent to $A^D \supset A^D A^D$;
a finite R-submodule $A$ of $A_0$ which contains $R$ and spans $A_0$ is the ring
of a prolongation if and only if both $A$ and its complementary module $A^D$
are closed under multiplication.

The prolongations of $G_0$ are partially ordered. If $G = \operatorname{Spec}(A)$ and
$G' = \operatorname{Spec}(A')$ are two prolongations, we write $G \ge G'$ if $A \supset A'$, that is,
if there is a morphism $G \to G'$ (necessarily unique) inducing the identity
on $G_0$.


Proposition 4.1.1. Two prolongations of $G_0$ have a sup and an inf.

Proof. If $G' = \operatorname{Spec}(A')$ and $G'' = \operatorname{Spec}(A'')$ are two prolongations with
$A', A'' \subset A_0$, let $A = A'A''$ be the R-algebra generated by $A'$ and $A''$.
Then

$$c(A) = c(A')c(A'') \subset (A' \otimes A')(A'' \otimes A'') = A'A'' \otimes A'A'' = A \otimes A.$$

Hence $G = \operatorname{Spec}(A)$ is a prolongation of $G_0$ and it is obviously a least upper
bound for $G'$ and $G''$ in the partially ordered set of all such prolongations.
Cartier duality reverses order, so $\inf(G', G'') = (\sup(G'^D, G''^D))^D$ is a
greatest lower bound.

Corollary. If $G_0$ has a prolongation, then it has a maximal one $G^+$ and
a minimal one $G^-$.


Proof. $G^+$ exists because the rings of prolongations are R-orders in the
separable K-algebra $A_0$, so are all contained in the maximal order, the
integral closure of $R$ in $A_0$. By duality, there is also a minimal prolongation.


(4.2) Dévissage. Let

$$0 \to G_0' \to G_0 \to G_0'' \to 0$$

be a short exact sequence of finite K-group schemes, and let

$$A_0' \leftarrow A_0 \leftarrow A_0''$$

be the corresponding K-algebra picture. Suppose $G = \operatorname{Spec}(A)$ is a prolon-
gation of $G_0$. Let $A'$ be the image of $A$ in $A_0'$ and put $G' = \operatorname{Spec}(A')$, the
"scheme-theoretic closure of $G_0'$ in $G$." Obviously $G'$ is a closed subgroup
scheme of $G$ prolonging $G_0'$, and is the unique one such that the inclusion
$G' \subset G$ extends the given $G_0' \to G_0$. The quotient $G'' := G/G'$ is a prolon-
gation of $G_0'' = G_0/G_0'$. By induction on the order of $G$, this proves part (a)
of:


Proposition 4.2.1. Suppose $G$ is a prolongation of $G_0$ and $(G_0^{(j)})_j$
is a composition series for $G_0$.

(a) There exist unique prolongations $G^{(j)}$ of the $G_0^{(j)}$ such that the inclu-
sions $G_0^{(j)} \subset G_0^{(j+1)}$ induce closed immersions $G^{(j)} \subset G^{(j+1)}$. The quotient
$G^{(j+1)}/G^{(j)}$ is a prolongation of $G_0^{(j+1)}/G_0^{(j)}$ for each $j$.

(b) Suppose $H$ is another prolongation of $G_0$, such that $G \ge H$. Then
$G^{(j+1)}/G^{(j)} \ge H^{(j+1)}/H^{(j)}$ for each $j$, and if equality holds for the quo-
tients for each $j$, then $G = H$.


Proof. (of (b)) By induction on the length $n$ of the series, it suffices to
consider the case $n = 2$, in which case we have a diagram

$$
\begin{array}{ccccccccc}
0 & \to & G^{(1)} & \to & G & \to & G/G^{(1)} & \to & 0 \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
0 & \to & H^{(1)} & \to & H & \to & H/H^{(1)} & \to & 0
\end{array}
$$

with exact rows, and must conclude that the central vertical arrow is an
isomorphism if the outer two are. This is true in an abelian category, hence
also for our group schemes which form a full subcategory of a category of
sheaves. In our special case, another way to see this is to note that the
discriminant ideal of $G$ is determined by the discriminant ideals of the
quotients $G^{(j+1)}/G^{(j)}$, cf. [O-T].

(4.3) F-vector space schemes. Let $\bar K$ be an algebraic closure of $K$ and
$\mathcal{G} = \operatorname{Gal}(\bar K/K)$. Let $G$ be a finite flat commutative R-group scheme. Since
$\operatorname{char}(K) = 0$, the generic fiber $G_0 = G_K$ is determined by the $\mathcal{G}$-module
$G_0(\bar K) = G(\bar K)$. Let us call $G$ simple if $G_0$ is simple, or what is the same,
if $G(\bar K)$ is a simple $\mathcal{G}$-module.


Lemma 4.3.1. Suppose $R$ is strictly henselian and $G$ is simple of p-power
order. Then the pro-p Sylow subgroup $P$ of $\mathcal{G}$ is normal, with procyclic
quotient $\mathcal{G}_{\mathrm{tame}}$, and $\mathcal{G}$ acts on $G(\bar K)$ through $\mathcal{G}_{\mathrm{tame}}$.


Proof. This is well-known; $\mathcal{G}$ is an inertia group and $P$ the ramification
group fixing the maximal tame extension of $K$. The subgroup of $G(\bar K)$ of
points fixed by $P$ is stable by $\mathcal{G}$ since $P$ is normal, and is of order divisible
by $p$ by the usual counting argument, hence is all of $G(\bar K)$ by simplicity.


Suppose $G$ is simple and $\mathcal{G}$ acts on $G(\bar K)$ through an abelian quotient
group, as in the lemma. Then $G(\bar K)$, as simple module over the com-
mutative ring $\mathbb{Z}[\mathcal{G}]$, is a 1-dimensional vector space over a residue field $F$
of $\mathbb{Z}[\mathcal{G}]$. This field $F$ is necessarily finite, having the same number of ele-
ments as $G(\bar K)$, i.e., as the order of $G$. Scalar multiplication by any element
of $F^*$ commutes with the action of $\mathcal{G}$ on $G(\bar K)$ and therefore induces an au-
tomorphism of the K-scheme $G_0$. These automorphisms may not extend
to $G$, but they certainly extend to the maximal and minimal prolonga-
tions $G^+$ and $G^-$ of $G_0$, by "transport of structure." Thus $G^+$ and $G^-$ are
F-module schemes over $R$ in the sense of the following paragraph.

Let $F$ be an associative ring with unity. By an F-module scheme over
a base scheme $S$ one means a commutative S-group scheme $G$ together
with a unitary ring homomorphism $F \to \operatorname{End}(G)$. This makes $G(T)$ an
F-module for every $T$ over $S$, functorially in $T$, and in this way an F-
module scheme is the same thing as a representable functor from (Sch/S)
to the category of F-modules. As a matter of notation, if $G = \operatorname{Spec}(A)$
is an affine F-module scheme, we denote by $[t]: A \to A$ the Hopf algebra
endomorphism corresponding to the action of $t$ on $G$. Thus

$$([t]f)(x) = f(tx) \quad \text{for } f \in A \text{ and } x \in G(T), \text{ any } T.$$

If $F$ is a finite field, we will call an F-module scheme which is finite flat
of the same order as $F$ a Raynaud F-module scheme. The discussion above
proves:


Proposition 4.3.2. Suppose $G_0$ is a simple commutative K-group scheme
of p-power order which has a prolongation $G$. Suppose $R$ is strictly hensel-
ian or, more generally, that $\mathcal{G}$ acts on $G(\bar K)$ through an abelian quotient
group. Then

$$F := \operatorname{End}(G_0) = \operatorname{End}(G^+) = \operatorname{End}(G^-)$$

is a finite field and $G_0$, $G^+$, and $G^-$ are Raynaud F-module schemes.

(4.4) The classification theorem. Let $F$ be a finite field of characteris-
tic $p$ and denote by $q = p^r$ the number of elements of $F$. Let $\mu = \mu_{q-1}(\bar K)$
be the group of $(q-1)$th roots of 1. Let $M = \operatorname{Hom}(F^*, \mu)$ be the group
of "multiplicative characters" of $F$, and extend each $\chi \in M$ to all of $F$
by putting $\chi(0) = 0$. If $\mu \subset R$, then there are $r$ special elements $\chi \in M$,
called fundamental characters, such that the map

$$F \xrightarrow{\ \chi\ } R \to k = R/\pi R$$

is a homomorphism of fields. If $\chi$ is one such character, the others are $\chi^{p^i}$,
$0 < i < r$, and $\chi^{p^r} = \chi$. Hence the fundamental characters form a set
$(\chi_i)_{i \in \mathcal{I}}$ indexed by a principal homogeneous space $\mathcal{I}$ over $\mathbb{Z}/r\mathbb{Z}$, in such a
way that $\chi_{i+1} = \chi_i^p$.


Theorem 4.4.1. Let $F$, $\mu$ and $(\chi_i)_{i \in \mathcal{I}}$ be as above and suppose $\mu \subset R$.
(a) Let $(\delta_i)_{i \in \mathcal{I}}$ be elements of $R$ such that $0 \le v(\delta_i) \le e = v(p)$ for each $i$.
Let $A$ be the R-algebra presented by generators $X_i$, $i \in \mathcal{I}$, and relations
$X_i^p = \delta_i X_{i+1}$, $i \in \mathcal{I}$. Then there is a unique F-module structure on the
R-scheme $G = \operatorname{Spec} A$ such that $[s]X_i = \chi_i(s)X_i$ for each $i \in \mathcal{I}$ and $s \in F$,
and with that structure $G$ is a Raynaud F-module scheme.

(b) Every Raynaud F-module scheme over $R$ is of the type described in (a).

(c) Suppose $G$ and $G'$ are two such schemes defined, respectively, by the
equations $X_i^p = \delta_i X_{i+1}$ and the equations $X_i'^p = \delta_i' X_{i+1}'$. The F-module
scheme homomorphisms $G' \to G$ correspond to the R-algebra homomor-
phisms of the form $X_i \mapsto a_i X_i'$, where $(a_i)_{i \in \mathcal{I}}$ is a family of elements of $R$
such that $a_{i+1}\delta_i = a_i^p \delta_i'$ for each $i \in \mathcal{I}$.


Proof. We prove (b) first. Let $G = \operatorname{Spec} A$ be a Raynaud F-module scheme
over $R$. The geometric generic fiber $G_{\bar K} = \operatorname{Spec}(A \otimes_R \bar K)$ is the constant
scheme $G(\bar K)_{\bar K}$ of the 1-dimensional F-vector space $G(\bar K)$ and is therefore
F-isomorphic to the constant scheme $F_{\bar K}$. Choosing an isomorphism, we
can view $A \otimes_R \bar K$ as the ring of $\bar K$-valued functions on $F$, which has a $\bar K$-
basis consisting of the constant function 1 and the multiplicative characters
$\chi \in M$, or, what is the same, consisting of the monomials

$$\prod_{i \in \mathcal{I}} \chi_i^{h_i}, \qquad 0 \le h_i \le p-1,$$

if we interpret $\prod_i \chi_i^0$ as 1. Since $R$ contains $(q-1)^{-1}$ and the $(q-1)$th
roots of 1, the action of $F^*$ on the augmentation ideal $I$ of $A$ gives a direct
sum decomposition

$$I = \bigoplus_{\chi \in M} I_\chi, \quad \text{where } I_\chi = \{f \in I : [s]f = \chi(s)f \text{ for all } s \in F^*\}.$$

Each of the R-modules $I_\chi$ is of rank 1 because $I_\chi \otimes \bar K = \bar K\chi$. For each
fundamental character $\chi_i$, choose a basis element $X_i$ for $I_{\chi_i}$, and write
$X_i = c_i\chi_i$, $c_i \in \bar K^*$. Since obviously $I_\chi^p \subset I_{\chi^p}$ for each $\chi$, we have $X_i^p =
\delta_i X_{i+1}$ with $\delta_i = c_i^p/c_{i+1} \in R$. Hence the R-subalgebra of $A$ generated by
the $X_i$ has a basis consisting of the monomials

$$\prod_{i \in \mathcal{I}} X_i^{h_i} = \prod_{i \in \mathcal{I}} c_i^{h_i} \cdot \prod_{i \in \mathcal{I}} \chi_i^{h_i}.$$


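To show this algebra is all of $A$, we consider the Cartier dual $G^D =
\operatorname{Spec}(A^D)$. Identify $A^D \otimes \bar K$ with the group algebra $\bar K[F]$. Denoting by $\lambda^s$
the basis element of $\bar K[F]$ corresponding to $s \in F$, so that $\lambda^{s+t} = \lambda^s\lambda^t$, the
linear duality is given by $\langle \lambda^s, f\rangle = f(s)$, and it is easy to check that the
basis of the augmentation ideal in $\bar K[F]$ which is dual to the basis $(\chi)_{\chi \in M}$
of $I \otimes \bar K$ is then $(e_\chi)_{\chi \in M}$, where

$$e_\chi = \frac{1}{q-1} \sum_{s \in F^*} \chi^{-1}(s)(\lambda^s - 1).$$

Put $Y_i = c_i^{-1} e_{\chi_i}$. Then $(I^D)_{\chi_i} = RY_i$, and there are constants $\gamma_i \in R$ such
that $Y_i^p = \gamma_i Y_{i+1}$. For each $\chi = \prod \chi_i^{h_i}$, $0 \le h_i \le p-1$, not all $h_i = 0$, we
have

$$\Big\langle \prod Y_i^{h_i}, \prod X_i^{h_i} \Big\rangle = \Big\langle \prod e_{\chi_i}^{h_i}, \prod \chi_i^{h_i} \Big\rangle = \Big\langle \prod e_{\chi_i}^{h_i}, \chi \Big\rangle = w_\chi,$$

say, and

$$\gamma_i\delta_i = \langle Y_i^p, X_i^p \rangle = \langle e_{\chi_i}^p, \chi_{i+1} \rangle = w_i,$$

say, with $w_\chi$ and $w_i$ in $R$ depending only on $F$, i.e., on $G_{\bar K}$, not on $G$.
These $w$'s are unchanged by automorphism of $F$, hence $w_\chi = w_{\chi^p}$, and
$w_i = w$ is independent of $i$. By considering $(F_{\mathbb{Z}})^D$ mod $p$, Raynaud proves

$$w_\chi \equiv \prod_{i \in \mathcal{I}} h_i! \pmod{pR} \quad \text{and} \quad w \equiv -p \pmod{p^2R}.$$

Since the $w_\chi$ are units in $R$, it follows that $R[(X_i)_{i \in \mathcal{I}}]$ and $R[(Y_i)_{i \in \mathcal{I}}]$ are
complementary modules contained, respectively, in $A$ and $A^D$ and therefore
equal, respectively, to $A$ and $A^D$, since $A$ and $A^D$ are complementary. This
proves (b).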

Let $E$ be the field of $(q-1)$th roots of 1, and let $O$ be the local ring of
a place of $E$ above $p$. Obviously the congruences for the $w_\chi$ and $w$ above,
which express properties of the constant scheme $F_O$, are the key to the proof
just given; the rest is formal. Another approach to these congruences,
which gives explicit expressions for the $w$'s, uses the self-duality of $F$ given
by an additive character $\psi(s) = \zeta^{\operatorname{Tr}(s)}$, where $\zeta$ is a primitive $p$th root of 1.
This duality gives an isomorphism $F_{O[\zeta]} \to F^D_{O[\zeta]}$ which, over $E(\zeta)$, is given
on the Hopf algebras by $e_\chi \mapsto \tau(\chi)\chi$, where

$$\tau(\chi) = \frac{1}{q-1} \sum_{s \in F} \chi^{-1}(s)(\psi(s) - 1).$$




From this it follows that $w = \tau^{p-1}$, where $\tau = \tau(\chi_i)$, any $i$, and that,
for $\chi = \prod_i \chi_i^{h_i}$ as usual,

$$w_\chi = \frac{\prod_i \tau(\chi_i)^{h_i}}{\tau(\chi)}.$$

Up to powers of $q-1$, the $\tau(\chi)$'s are Gauss sums and the $w_\chi$'s are Jacobi
sums. From this point of view, the congruences above are equivalent to
classical results of Stickelberger on the leading term of the $(\zeta-1)$-adic
expansion of Gauss sums in the ring $O[\zeta]$.

We now turn to the proof of 4.4.1(a). Let $\delta_i$, $A = R[(X_i)]$, and $G =
\operatorname{Spec}(A)$ be as in (a). Fix $j \in \mathcal{I}$. Eliminating $X_i$ for $i \ne j$ in the system of
equations $X_i^p = \delta_i X_{i+1}$, one finds that the map $P \mapsto X_j(P)$ is a bijection
between $G(\bar K)$ and the set of solutions $x \in \bar K$ of the equation $x^q = A_j x$,
where $A_j = \delta_{j-1}\delta_{j-2}^{p} \cdots \delta_{j-r}^{p^{r-1}}$. From this it is easy to see that the set $G(\bar K)$ has
a unique structure of (1-dimensional) F-vector space such that $X_i(sP) =
\chi_i(s)X_i(P)$ holds for $s \in F$ and $P \in G(\bar K)$.

Choose a point $P \ne 0$ in $G(\bar K)$ and put $c_i = X_i(P)$. Note that the
action of $\mathcal{G} = \operatorname{Gal}(\bar K/K)$ on $G(\bar K)$ is F-linear via the homomorphism
$\phi: \mathcal{G} \to F^*$ for which $\chi_i(\phi(\sigma)) = c_i^{\sigma-1}$. Viewing $A$ as a ring of functions
on $F$ via the isomorphism $s \mapsto sP$ from $F$ to $G(\bar K)$, so that $X_i$ becomes the
function $c_i\chi_i$, one checks as in the proof of (b), using congruence properties
of the $w$'s, that $A$ is stable under the comultiplication in $F_{\bar K}$, because its
complementary module in $\bar K[F]$ is the ring $R[(Y_i)]$ with $Y_i = c_i^{-1}e_{\chi_i}$.

Part (c) is left to the reader.


In case $q = 2$, $F = \mathbb{Z}/2\mathbb{Z}$, the results of §3.2 give a generalization
of Theorem 4.4.1. The corresponding generalization is proved in [R] for
arbitrary $q$. Let $R$ be any local ring of residue characteristic $p$ which is an
algebra over the ring $O$ above. Then Raynaud F-module schemes $G$ over $R$
are described by families $\delta_i$, $\gamma_i$, $i \in \mathcal{I}$, of elements of $R$ such that $\gamma_i\delta_i = w$
for each $i$, as follows. We have $G = \operatorname{Spec}(A)$ and $G^D = \operatorname{Spec}(A^D)$, where

$$A = R[(X_i)];\ X_i^p = \delta_i X_{i+1} \quad \text{and} \quad A^D = R[(Y_i)];\ Y_i^p = \gamma_i Y_{i+1}.$$

For $\chi \in M$, put $\chi = \prod \chi_i^{h_i}$, $0 \le h_i \le p-1$, not all $h_i = 0$, and put
$X_\chi = \prod X_i^{h_i}$ and $Y_\chi = \prod Y_i^{h_i}$. The pairing $A \times A^D \to R$ is the one for
which $(X_\chi)_{\chi \in M}$ and $(w_\chi^{-1}Y_\chi)_{\chi \in M}$ are dual bases of $I$ and $I^D$. The comulti-
plication in $A$ is most easily described as the transpose of the multiplication
in $A^D$. The explicit formula for it, given in [R], is a bit messy, involving
not only the $w_\chi$'s but also some products of the $\gamma_i$'s which come from
using the relations $Y_i^p = \gamma_i Y_{i+1}$ as necessary to express a product $Y_\chi Y_\psi$
of two basic monomials as a constant times $Y_{\chi\psi}$.

The proof of the generalization is not hard. For existence, for example,
one simply redoes the proof of 4.4.1(a) in the following generic situation.
Let $K = E((\delta_i))$ be the field of rational functions in $r$ variables $\delta_i$ over $E$ and
$R = O[\delta_i, \gamma_i]$, where $\gamma_i = w/\delta_i$. Then exactly as in the proof of 4.4.1(a),
one shows that

$$\operatorname{Spec}(R[X_i];\ X_i^p = \delta_i X_{i+1})$$

is a Raynaud F-module scheme whose extensions give the ones which were to be
constructed.


(4.5) Applications. In this section we prove two fundamental results of
Raynaud which are used in [C].


Theorem 4.5.1. Suppose $e = v(p) < p-1$. (There are no assumptions
on $R$ other than that it be a discrete valuation ring of mixed characteris-
tic $(0,p)$.) Let $G$ be a commutative finite flat R-group scheme of p-power
order. Then $G$ is, up to isomorphism, the unique prolongation of its generic
fiber $G_K$.


Proof. If $G$ is not unique, then $G^+ > G^-$. Such a strict inequality is
preserved by passage to an extension of $R$, so we can assume without
loss of generality that $R$ is strictly henselian. Also, by dévissage (Propo-
sition 4.2.1(b)), we can assume that $G$ is simple. By Proposition 4.3.2,
then, $G^+$ and $G^-$ are Raynaud F-module schemes for the finite field
$F = \operatorname{End}(G_K)$. Since $R$ is strictly henselian, $\mu \subset R$. Therefore $G^+$ and $G^-$
are Raynaud F-module schemes of the type described in Theorem 4.4.1.
Taking $G' = G^+$ and $G = G^-$ in Theorem 4.4.1(c), we find that there are
constants $a_i \in R$, $a_i \ne 0$, such that $\delta_i = a_i^p a_{i+1}^{-1}\delta_i'$ for $i \in \mathcal{I}$. Choosing
an $i$ such that $v(a_i)$ is maximal, we conclude $e \ge (p-1)v(a_i)$, which, if
$p-1 > e$, implies $v(a_i) = 0$. Hence all $a_i$ are units in $R$ and $G = G'$ as
was to be shown.
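
For the reader's convenience, the valuation estimate at the end of the proof can be spelled out as follows. This is only an expanded sketch of the step in the text, using the relation $a_{i+1}\delta_i = a_i^p\delta_i'$ of Theorem 4.4.1(c) and the bounds $0 \le v(\delta_i), v(\delta_i') \le e$ of Theorem 4.4.1(a).

```latex
% Choose i with v(a_i) maximal, so that v(a_{i+1}) <= v(a_i).
\begin{align*}
  e \;\ge\; v(\delta_i)
    &= p\,v(a_i) - v(a_{i+1}) + v(\delta_i')  \\
    &\ge p\,v(a_i) - v(a_i) + 0               \\
    &= (p-1)\,v(a_i).
\end{align*}
% If e < p-1, this gives v(a_i) < 1, hence v(a_i) = 0 because v(a_i) is a
% non-negative integer. By maximality, v(a_j) = 0 for every j, i.e. every
% a_j is a unit of R.
```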


Corollary. Write $\mathcal{C}$ for the category of commutative finite flat R-group
schemes of p-power order. Let $G$ and $H$ be objects in $\mathcal{C}$. If $p-1 > e$, then
the natural map

$$\operatorname{Hom}_{\mathcal{C}}(G, H) \to \operatorname{Hom}_{\mathcal{G}}(G(\bar K), H(\bar K))$$

is bijective and the natural map

$$\operatorname{Ext}_{\mathcal{C}}(G, H) \to \operatorname{Ext}_{\mathcal{G}}(G(\bar K), H(\bar K))$$

is injective.

Proof. The injectivity of the map on Hom's is obvious and doesn't re-
quire $e < p-1$. For surjectivity, let $G_0$ and $H_0$ be the generic fibers and
$u_0: G_0 \to H_0$ a homomorphism. We must show $u_0$ has a prolongation
$u: G \to H$, assuming $e < p-1$. There are homomorphisms

$$G \to G/(\operatorname{Ker}(u_0) \text{ in } G) \to (\operatorname{Image}(u_0) \text{ in } H) \to H,$$

where ($X$ in $Y$) is a temporary notation for the scheme-theoretic closure
of $X$ in $Y$ (see the discussion before Proposition 4.2.1). The key point is
the isomorphism in the middle. Both groups it connects prolong the same
generic fiber $\operatorname{Image}(u_0)$ and they are therefore isomorphic, by the theorem.
Also by unicity of prolongations, an exact sequence

$$0 \to G' \to G \to G'' \to 0$$

of prolongations is determined up to isomorphism by the sequence of its
generic fibers, so the Ext map is injective.

Which K-schemes $G_0$ have a prolongation? In case $G_0$ is a Raynaud
F-module scheme and $\mu \subset R$, the classification theorem gives the following
answer.


Theorem 4.5.2. Let $F$ be a finite field with $q = p^r$ elements. Suppose
$\mu := \mu_{q-1}(\bar K) \subset K$ and let $\chi_i : F^* \to \mu$ be a fundamental character. Let
$G_0$ be a Raynaud F-module scheme over $K$ and $\phi: \mathcal{G} \to F^*$ the character
giving the action of $\mathcal{G} = \operatorname{Gal}(\bar K/K)$ on $G_0(\bar K)$. Then $G_0$ has a prolongation
(or, as one says, the representation $\mathcal{G} \to \mathrm{GL}_1(F)$ is "flat") if and only if
there is an element $A \in K^*$ such that

$$\chi_i(\phi(\sigma)) = (A^{1/(q-1)})^{\sigma-1} \quad \text{and} \quad v(A) = \sum_{k=0}^{r-1} n_k p^k,$$

with integers $n_k$ in the range $0 \le n_k \le e$.
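
The valuation half of this criterion, namely that $v(A) = \sum_{k=0}^{r-1} n_k p^k$ with $0 \le n_k \le e$, is a purely combinatorial condition on the integer $v(A)$ and is easy to explore by brute force. The following sketch (function names are hypothetical, and it says nothing about the Galois-theoretic half of the criterion) lists the admissible digit vectors $(n_0, \dots, n_{r-1})$ for given $p$, $r$, $e$.

```python
from itertools import product

def flat_digit_expansions(v, p, r, e):
    """All tuples (n_0, ..., n_{r-1}) with 0 <= n_k <= e and sum n_k * p^k == v."""
    return [digits for digits in product(range(e + 1), repeat=r)
            if sum(n * p**k for k, n in enumerate(digits)) == v]

def satisfies_valuation_condition(v, p, r, e):
    """True when the integer v admits at least one such expansion."""
    return bool(flat_digit_expansions(v, p, r, e))

# Example: q = 9 (p = 3, r = 2). For e = 1 only four valuations are allowed;
# for e = 2 = p - 1 every integer from 0 to 8 is allowed.
allowed_e1 = [v for v in range(9) if satisfies_valuation_condition(v, 3, 2, 1)]
allowed_e2 = [v for v in range(9) if satisfies_valuation_condition(v, 3, 2, 2)]
```

When $e < p-1$ the expansion, if it exists, is the base-$p$ expansion of $v$ and is therefore unique, which is consistent with the uniqueness of prolongations in Theorem 4.5.1.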


Proof. Suppose $G_0$ has a prolongation $G$. Replacing $G$ by $G^+$ if necessary,
we may assume $G$ is a Raynaud F-module scheme over $R$, hence is of
the type described in Theorem 4.4.1(a) via equations $X_i^p = \delta_i X_{i+1}$. Let
$x_i = X_i(P)$ for some point $P \in G(\bar K)$. The relations $x_i^p = \delta_i x_{i+1}$ imply
$x_i^q = A_i x_i$, where

$$A_i = \delta_{i-1}\delta_{i-2}^{p} \cdots \delta_{i-r}^{p^{r-1}}.$$

Choosing $P \ne 0$, we have $x_i \ne 0$, hence $x_i^{q-1} = A_i$. On the other hand,

$$x_i^\sigma = \chi_i(\phi(\sigma))x_i \quad \text{for } \sigma \in \mathcal{G},$$

so the condition of the theorem is satisfied with $A = A_i$.


Conversely, given $A$ with $v(A)$ as in the theorem, we can construct $\delta$'s
giving a prolongation by putting $\delta_{i-1-k} = \pi^{n_k}$ for $1 \le k \le r-1$ and
defining $\delta_{i-1}$ by the equation $A = \delta_{i-1}\delta_{i-2}^{p} \cdots \delta_{i-r}^{p^{r-1}}$.

The theorem just proved applies in particular in the case $R$ is strictly
henselian and $G_0$ simple of p-power order, and gives immediately in that
case a result conjectured by Serre which was a main motivation for Ray-
naud's work, and which, in case $R = \mathbb{Z}_p$, is Theorem 1.7 of [C].

THREE LECTURES ON THE MODULARITY OF $\bar\rho_{E,3}$
AND THE LANGLANDS RECIPROCITY CONJECTURE


STEPHEN GELBART


WILES' work on Fermat's Last Theorem is based on methods due to
FALTINGS, FREY, LANGLANDS, MAZUR, RIBET, SERRE, TAYLOR,
and others. My purpose in these Lectures is to explain how the (automor-
phic representation theoretic methods and) results of LANGLANDS come
into the proof, and how these results themselves are proved. An Introduc-
tion to each of the Lectures describes more of the topics discussed; but the
titles already speak for themselves:


Lecture I: "The Modularity of $\bar\rho_{E,3}$ and Automorphic Representations of
Weight One"


Lecture II: "The Langlands Program: Some Results and Methods"
Lecture III: "Proof of the Langlands-Tunnell Theorem"

Lecture I
The Modularity of $\bar\rho_{E,3}$ and Automorphic
Representations of Weight One


Abstract
The following result is a small but key step in Wiles' proof of the
Shimura-Taniyama-Weil Conjecture:


Proposition 1.4. For an elliptic curve $E$ over $\mathbb{Q}$, let

$$\bar\rho_{E,p} : G_{\mathbb{Q}} \to \mathrm{GL}_2(\mathbb{F}_p)$$

denote the natural representation of $G_{\mathbb{Q}} = \operatorname{Gal}(\bar{\mathbb{Q}}/\mathbb{Q})$ on the points of $E(\bar{\mathbb{Q}})$
of order $p$. Then if $p = 3$, and $\bar\rho_{E,3}$ is irreducible, it must also follow that
$\bar\rho_{E,3}$ is modular, i.e., there exists a normalized eigen-cuspform

$$f(z) = \sum_{n=1}^{\infty} a_n e^{2\pi i n z}$$

of weight two, and a prime $\lambda$ of $\bar{\mathbb{Q}}$ containing 3, such that

$$a_q \equiv \operatorname{trace}(\bar\rho_{E,3}(\mathrm{Fr}_q)) \pmod{\lambda}$$

for almost all primes $q$. ($\mathrm{Fr}_q$ is explained below.)

Our main purpose in this Lecture is to explain how this result follows
from the following special case of Langlands' Reciprocity Conjecture for
Artin L-functions:


Theorem 1.3. (cf. [La1] and [Tu]). Suppose that the continuous represen-
tation

$$\sigma : G_{\mathbb{Q}} \to \mathrm{GL}_2(\mathbb{C})$$

is "odd," irreducible, and has solvable image in $\mathrm{PGL}_2(\mathbb{C})$. (Here odd means
that if $\tau$ denotes complex conjugation in $G_{\mathbb{Q}}$, then $\det(\sigma(\tau)) = -1$.) Then
there exists a normalized eigen-cuspform

$$g(z) = \sum_{n=1}^{\infty} b_n e^{2\pi i n z}$$

of weight one such that

$$b_q = \operatorname{trace}(\sigma(\mathrm{Fr}_q))$$

for all but finitely many primes $q$.


As we shall see, the proof of this result requires working not only over
an arbitrary number field, but also with automorphic cuspidal representa-
tions (in place of classical cusp forms). Thus the second half of this Lecture
will be devoted to recalling the basic representation theory required to re-
formulate Theorem 1.3 as follows:

Theorem 2.6. For each irreducible representation

$$\sigma : G_{\mathbb{Q}} \to \mathrm{GL}_2(\mathbb{C})$$

which is odd and solvable, there is an automorphic "weight one" cuspidal
representation of $\mathrm{GL}_2(\mathbb{A}_{\mathbb{Q}})$, call it $\pi(\sigma)$, with the property that

$$\operatorname{trace}(t_{\pi_q}) = \operatorname{trace}(\sigma(\mathrm{Fr}_q))$$

for almost every $q$. (Here $t_{\pi_q}$ denotes the Langlands class in $\mathrm{GL}_2(\mathbb{C})$ as-
sociated to the unramified local component $\pi_q$ of $\pi(\sigma) = \otimes\pi_p$, and "weight
one" means that $\pi_\infty$ is the principal series representation of $\mathrm{GL}_2(\mathbb{R})$ in-
duced from the characters 1 and sgn.)

§1. The Modularity of $\bar\rho_{E,3}$

1.1. Galois Representations mod p

Let $E$ denote a fixed elliptic curve defined over $\mathbb{Q}$. For a chosen prime
$p$, let $E[p]$ denote the subgroup of $E(\bar{\mathbb{Q}})$ consisting of points of order dividing $p$.
Then $E[p] \cong \mathbb{F}_p^2$, regarded as a two-dimensional vector space over $\mathbb{F}_p$. The
natural action of the Galois group

$$G_{\mathbb{Q}} = \operatorname{Gal}(\bar{\mathbb{Q}}/\mathbb{Q})$$

on $E[p]$ consequently gives rise to a continuous representation

$$\bar\rho_{E,p} : G_{\mathbb{Q}} \to \mathrm{GL}_2(\mathbb{F}_p) = \operatorname{Aut}(E[p]),$$

which is well defined up to isomorphism. That $\bar\rho_{E,p}$ encodes
much of the arithmetic of $E$ is clear from the following two crucial properties
of $\bar\rho_{E,p}$:

(a) Write

$$\omega_p : G_{\mathbb{Q}} \to \mathbb{F}_p^*$$

for the character giving the action of $G_{\mathbb{Q}}$ on the p-th roots of unity $\mu_p$.
Then

(1.1.1) $\det(\bar\rho_{E,p}) = \omega_p$;

this results from the existence of a "Weil pairing" $E[p] \times E[p] \to \mu_p$,
compatible with the action of $G_{\mathbb{Q}}$ and such that $\Lambda^2(E[p]) \cong \mu_p$ (cf. §V.2
of [Silv]).

(b) If $q$ is any prime number, and $\mathfrak{Q}$ is a prime of $\bar{\mathbb{Q}}$ dividing $q$, let $\mathrm{Fr}_q$
denote the canonical Frobenius conjugacy class in $D_{\mathfrak{Q}}/I_{\mathfrak{Q}}$ (the quotient of
the decomposition group at $\mathfrak{Q}$ by the inertia group at $\mathfrak{Q}$). Then

(1.1.2) $\operatorname{trace} \bar\rho_{E,p}(\mathrm{Fr}_q) \equiv q + 1 - \#(E(\mathbb{F}_q)) \pmod{p}$

for almost all primes $q$, namely those where $\bar\rho_{E,p}$ is trivial on (any) $I_{\mathfrak{Q}}$ (i.e.
those $q$ where $\bar\rho_{E,p}$ is unramified).

N.B. (i) The invariants $\operatorname{trace} \bar\rho_{E,p}(\mathrm{Fr}_q)$ (and also $\det \bar\rho_{E,p}(\mathrm{Fr}_q)$) are
well-defined elements of $\mathbb{F}_p$ precisely when $\bar\rho_{E,p}$ is unramified at $q$.

(ii) The identity (1.1.2) essentially amounts to the Riemann hypothesis
for elliptic curves over finite fields (proved by Hasse; cf. §V.2 of [Silv]).

(iii) Alternatively, the primes $q$ for which (1.1.2) holds can be charac-
terized as those which are different from $p$ and such that $E$ has "good reduc-
tion mod $q$." Equivalently, let $K$ be the kernel of $\bar\rho_{E,p}$, and $\bar{\mathbb{Q}}^K := \mathbb{Q}(E[p])$
the corresponding finite Galois extension of $\mathbb{Q}$; then

$$\operatorname{Gal}(\mathbb{Q}(E[p])/\mathbb{Q}) \cong \operatorname{Im} \bar\rho_{E,p},$$

and (1.1.2) holds exactly for those $q$ which are unramified in $\mathbb{Q}(E[p])$ (equiv-
alently, those $q$ such that $\bar\rho_{E,p}$ is trivial on $I_{\mathfrak{Q}}$).
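1.2. The Modularity of $\bar\rho_{E,p}$

Let $S_k(\Gamma_0(N), \varepsilon)$ denote the vector space of modular cusp forms $f(z)$ of
weight $k \ge 1$ and character $\varepsilon : (\mathbb{Z}/N\mathbb{Z})^* \to \mathbb{C}^*$.

Definition. We call $\bar\rho_{E,p}$ modular if there exists some (normalized) eigen-
form

$$f(z) = \sum_{n=1}^{\infty} a_n e^{2\pi i n z} \in S_2(\Gamma_0(N), \varepsilon)$$

(for some $N$ and $\varepsilon$), and a prime $\lambda$ of $\bar{\mathbb{Q}}$ containing $p$, such that

$$a_q \equiv q + 1 - \#(E(\mathbb{F}_q)) \pmod{\lambda}$$

for almost all primes $q$.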

Recall that Wiles' goal was to prove that $E$ itself is modular, i.e., for
some weight two $f(z)$ as above, the identity (as opposed to congruence)

$$a_q = q + 1 - \#(E(\mathbb{F}_q))$$

holds for almost all $q$. As discussed elsewhere, what Wiles actually proves
is Mazur's "Modular Lifting Conjecture":


If $p$ is a prime such that

(i) $\bar\rho_{E,p}$ is irreducible, and

(ii) $\bar\rho_{E,p}$ is modular,

THEN $E$ ITSELF IS MODULAR.

More precisely, Wiles proves that (a) the Modular Lifting Conjecture
is true for $p = 3$ and 5 when $E$ is a semistable elliptic curve, and (b) the
Modular Lifting Conjecture for $p = 3$ and 5 already implies the Taniyama-
Shimura-Weil Conjecture (that $E$ is modular).

Our modest goal is to explain how the theory of automorphic forms is
used to prove that for $p = 3$, the second hypothesis of the Modular Lifting
Conjecture automatically follows from the first, i.e., if $\bar\rho_{E,3}$ is irreducible,
then it is modular.
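
The identity $a_q = q + 1 - \#(E(\mathbb{F}_q))$ can be observed numerically in a classical example that is not taken from the text: the curve $E: y^2 + y = x^3 - x^2$ of conductor 11 and the weight two cusp form $f(z) = \eta(z)^2\eta(11z)^2 = q\prod_{n \ge 1}(1-q^n)^2(1-q^{11n})^2$ of level 11 (here $q = e^{2\pi i z}$ in the product). The sketch below, with hypothetical function names, counts points naively and expands the product; reducing both sides mod $p$ gives the congruence used in the definition of modularity of $\bar\rho_{E,p}$.

```python
def count_points(q):
    """#E(F_q) for E: y^2 + y = x^3 - x^2 over the prime field F_q,
    including the point at infinity (naive enumeration)."""
    affine = sum(1 for x in range(q) for y in range(q)
                 if (y * y + y - x**3 + x * x) % q == 0)
    return affine + 1

def eta_product_coefficients(bound):
    """Coefficients a_0, ..., a_bound of q * prod_{n>=1} (1-q^n)^2 (1-q^(11n))^2."""
    series = [0] * bound      # the infinite product, truncated below q^bound
    series[0] = 1

    def times_one_minus_q_to(k):
        # Multiply the truncated series in place by (1 - q^k).
        for i in range(bound - 1, k - 1, -1):
            series[i] -= series[i - k]

    for n in range(1, bound):
        for k in (n, 11 * n):
            if k < bound:
                times_one_minus_q_to(k)   # the factor occurs squared
                times_one_minus_q_to(k)
    return [0] + series       # shift by the leading factor q

a = eta_product_coefficients(20)
primes_of_good_reduction = (2, 3, 5, 7, 13)
comparison = {q: (a[q], q + 1 - count_points(q)) for q in primes_of_good_reduction}
```

Each entry of `comparison` pairs the Fourier coefficient with the point-count expression; the prime 11, where the curve has bad reduction, is deliberately left out.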

1.3. The Theorem of Langlands-Tunnell

The crucial ingredient in proving the modularity of $\bar\rho_{E,3}$ is the following:

Theorem 1.3. (cf. [La1] and [Tu]) Suppose

$$\sigma : G_{\mathbb{Q}} \to \mathrm{GL}_2(\mathbb{C})$$

is a continuous, irreducible two dimensional representation whose image
in $\mathrm{PGL}_2(\mathbb{C})$ is a solvable group. Suppose moreover that $\sigma$ is "odd" in the
sense that

$$\det(\sigma(\tau)) = -1.$$

($\tau$ is an automorphism in $G_{\mathbb{Q}}$ defined by complex conjugation.) Then there
exists a (normalized)

$$g(z) = \sum_{n=1}^{\infty} b_n e^{2\pi i n z} \in S_1(\Gamma_0(N), \psi)$$

(for some $N$ and $\psi$), such that $g$ is an eigenform for all the Hecke operators,
and

(1.3.1) $b_q = \operatorname{trace}(\sigma(\mathrm{Fr}_q))$

for almost all primes $q$.


Remarks. (1) Because any continuous representation

$$\sigma : G_{\mathbf{Q}} \longrightarrow GL_2(\mathbf{C})$$

factors through some finite Galois group Gal(K/Q), its image in $GL_2(\mathbf{C})$
is finite, and its image in $PGL_2(\mathbf{C})$ is just one of the symmetry groups
of a regular polyhedron in $\mathbf{R}^3$ (cf. section 13 of [Shaf]). From this it is
deduced that the image of any irreducible $\sigma$ in $PGL_2(\mathbf{C})$ is either $A_5$ (the
icosahedral case), $S_4$ (the octahedral case), $A_4$ (the tetrahedral case), or
$D_{2n}$ (the dihedral case). As we shall recall in §5.3, in the dihedral case the
existence of the required weight one form g(z) above is essentially due to
much earlier work of Hecke and Maass. Hence in dealing with “solvable” $\sigma$,
the theorem of Langlands and Tunnell is ultimately concerned with “just”
the tetrahedral and octahedral possibilities.

(2) The relevant theorems of [La1] and [Tu] do not actually produce the
required modular form g(z), but rather a certain automorphic
representation $\pi(\sigma)$. Using the fact that $\det(\sigma(\tau)) = -1$, we shall explain in §4.2
of Lecture II how this automorphic representation produces g(z) itself. In
the meantime, we take the above theorem as given, and use it to prove the
modularity of $\bar{\rho}_{E,3}$.


1.4. Proof of the Modularity of $\bar{\rho}_{E,3}$
More precisely, we need to prove:

Proposition 1.4. If $\bar{\rho}_{E,3}$ is irreducible, then it is modular, i.e., there exists
a normalized eigenform

$$f(z) = \sum_{n=1}^{\infty} a_n e^{2\pi i n z}$$

of weight two, and a prime $\lambda$ of $\bar{\mathbf{Q}}$ containing 3, such that

$$a_q \equiv q + 1 - \#E(\mathbf{F}_q) \pmod{\lambda}$$

for almost all primes q.


The strategy of proof is simple. First one “lifts” $\bar{\rho}_{E,3}$ to a complex
representation $\sigma : G_{\mathbf{Q}} \to GL_2(\mathbf{C})$ to which the Theorem of
Langlands-Tunnell is applicable; this produces a modular form g(z) of weight one
whose Fourier coefficients $b_q$ are almost everywhere equal to $\operatorname{trace}(\sigma(\mathrm{Fr}_q))$.
Then one multiplies g by an Eisenstein series of weight one, whose
(non-trivial) Fourier coefficients are all congruent to 0 mod 3; this essentially
produces the required form of weight two whose Fourier coefficients are
congruent to $\operatorname{trace}(\sigma(\mathrm{Fr}_q))$ modulo some divisor of 3 (and hence also congruent
to $q + 1 - \#E(\mathbf{F}_q)$, by virtue of (1.1.2)).

Because of the importance of Proposition 1.4, we shall go through its 
proof carefully (expanding on the single paragraph allotted it in Chapter 
V of [W1]). We note that the idea of applying “Langlands-Tunnell” in this 
context goes back to Serre (cf. [Se], §5.3, page 220). 

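Step 1. Extend $\bar{\rho}_{E,3} : G_{\mathbf{Q}} \to GL_2(\mathbf{F}_3)$ to a complex representation

$$\sigma : G_{\mathbf{Q}} \longrightarrow GL_2(\mathbf{C})$$

by composing $\bar{\rho}_{E,3}$ with a specific (injective) homomorphism

$$\Psi : GL_2(\mathbf{F}_3) \hookrightarrow GL_2(\mathbf{Z}[\sqrt{-2}]) \subset GL_2(\mathbf{C})$$

described below.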
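Following [RuSi], we introduce $\Psi$ directly through the formulas

$$\Psi\left(\begin{pmatrix} -1 & 1 \\ -1 & 0 \end{pmatrix}\right) = \begin{pmatrix} -1 & 1 \\ -1 & 0 \end{pmatrix}$$

and

$$\Psi\left(\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}\right) = \begin{pmatrix} 1 & -1 \\ -\sqrt{-2} & -1+\sqrt{-2} \end{pmatrix}.$$

Here $\alpha = \begin{pmatrix} -1 & 1 \\ -1 & 0 \end{pmatrix}$ and $\beta = \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}$ are two convenient generators of
$GL_2(\mathbf{F}_3)$. Once it is checked that the above formulas indeed preserve the
required relations, it is immediately seen that the resulting homomorphism

$$\Psi : GL_2(\mathbf{F}_3) \longrightarrow GL_2(\mathbf{Z}[\sqrt{-2}]) \subset GL_2(\mathbf{C})$$

is the identity upon reduction mod $(1 + \sqrt{-2})$. In particular,

$$(1.4.1)\qquad \operatorname{trace}(\Psi(g)) \equiv \operatorname{trace}(g) \pmod{1+\sqrt{-2}}$$

and

$$(1.4.2)\qquad \det(\Psi(g)) \equiv \det(g) \pmod{3 = (1+\sqrt{-2})(1-\sqrt{-2})}.$$

N.B. This representation

$$\Psi : GL_2(\mathbf{F}_3) \longrightarrow GL_2(\mathbf{C})$$

is really just one of the (three) so-called cuspidal representations of the
group $GL_2(\mathbf{F}_3)$; compare, for example, [PS1] §10.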
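
The check that the formulas for $\Psi$ “preserve the required relations” can be delegated to a short computation. The sketch below assumes that the images of the two generators are $\begin{pmatrix} -1 & 1 \\ -1 & 0 \end{pmatrix}$ and $\begin{pmatrix} 1 & -1 \\ -\sqrt{-2} & -1+\sqrt{-2} \end{pmatrix}$ (the values recalled from [RuSi]; treat them as an assumption of this example). It generates the subgroup of $GL_2(\mathbf{Z}[\sqrt{-2}])$ spanned by these two matrices and verifies that it has exactly 48 elements and that reduction mod $(1+\sqrt{-2})$ is injective on it. Since $GL_2(\mathbf{F}_3)$ also has 48 elements, reduction is then an isomorphism and $\Psi$ is its inverse. All function names are hypothetical.

```python
# Elements x + y*sqrt(-2) of Z[sqrt(-2)] are stored as integer pairs (x, y).
def zadd(u, v):
    return (u[0] + v[0], u[1] + v[1])

def zmul(u, v):
    # (x1 + y1*s)(x2 + y2*s) with s*s = -2
    return (u[0] * v[0] - 2 * u[1] * v[1], u[0] * v[1] + u[1] * v[0])

def matmul(A, B):
    # 2x2 matrices are tuples of rows; entries are pairs as above
    return tuple(
        tuple(zadd(zmul(A[i][0], B[0][j]), zmul(A[i][1], B[1][j])) for j in range(2))
        for i in range(2)
    )

def reduce_mod(A):
    # sqrt(-2) = -1 modulo (1 + sqrt(-2)), and the residue field is F_3
    return tuple(tuple((x - y) % 3 for (x, y) in row) for row in A)

def det(A):
    p, q = zmul(A[0][0], A[1][1]), zmul(A[0][1], A[1][0])
    return (p[0] - q[0], p[1] - q[1])

IDENTITY = (((1, 0), (0, 0)), ((0, 0), (1, 0)))
PSI_ALPHA = (((-1, 0), (1, 0)), ((-1, 0), (0, 0)))      # assumed image of alpha
PSI_BETA = (((1, 0), (-1, 0)), ((0, -1), (-1, 1)))      # assumed image of beta

def generated_group(generators):
    # closure of the identity under right multiplication by the generators
    seen, frontier = {IDENTITY}, [IDENTITY]
    while frontier:
        new = []
        for g in frontier:
            for h in generators:
                p = matmul(g, h)
                if p not in seen:
                    seen.add(p)
                    new.append(p)
        frontier = new
    return seen

group = generated_group([PSI_ALPHA, PSI_BETA])
reductions = {reduce_mod(g) for g in group}

assert len(group) == 48 and len(reductions) == 48      # reduction is a bijection
for g in group:
    r = reduce_mod(g)
    d = det(g)
    assert d in ((1, 0), (-1, 0))                       # det is +1 or -1 in Z
    assert d[0] % 3 == (r[0][0] * r[1][1] - r[0][1] * r[1][0]) % 3   # (1.4.2)
```

The trace congruence (1.4.1) needs no separate check: reduction is a ring homomorphism, so it commutes with taking traces.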
Step 2. Check that

$$\sigma = \Psi \circ \bar{\rho}_{E,3} : G_{\mathbf{Q}} \longrightarrow GL_2(\mathbf{C})$$

is “odd,” irreducible and solvable.
Let us first check that $\bar{\rho}_{E,3}$ itself has odd determinant. On the one
hand, (1.1.1) implies

$$\det(\bar{\rho}_{E,3}(\tau)) = \omega_3(\tau),$$

and it is clear that $\omega_3(\tau) = -1$. On the other hand, $\det \sigma(\tau)$ is a priori
$\pm 1$, since $\tau^2 = 1$, and (1.4.2) implies $\det(\sigma(\tau)) \equiv \det \bar{\rho}_{E,3}(\tau) \pmod 3$. So
since $-1 \not\equiv 1 \pmod 3$, we must have $\det(\sigma(\tau)) = -1$, as required. As for
the “solvable” assertion, just recall that

$$PGL_2(\mathbf{F}_3) \cong S_4;$$

this says that the image of $\sigma = \Psi \circ \bar{\rho}_{E,3}$ in $PGL_2(\mathbf{C})$ is a subgroup of $S_4$,
hence itself solvable.
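
The isomorphism $PGL_2(\mathbf{F}_3) \cong S_4$ and the solvability of $S_4$ are both small enough to confirm by brute force. The following sketch (with hypothetical helper names) lets the 48 matrices of $GL_2(\mathbf{F}_3)$ act on the four points of the projective line over $\mathbf{F}_3$, checks that exactly the scalar matrices act trivially, and then computes the derived series of the resulting permutation group.

```python
from itertools import product

P = 3
# the four points of the projective line over F_3, in normalised form
POINTS = [(1, 0), (1, 1), (1, 2), (0, 1)]

def normalise(x, y):
    if x % P:
        inv = pow(x, P - 2, P)            # inverse of x in F_3
        return (1, (y * inv) % P)
    return (0, 1)

def action(g):
    # permutation of POINTS induced by the matrix g = (a, b, c, d)
    a, b, c, d = g
    return tuple(
        POINTS.index(normalise((a * x + b * y) % P, (c * x + d * y) % P))
        for (x, y) in POINTS
    )

GL2 = [g for g in product(range(P), repeat=4) if (g[0] * g[3] - g[1] * g[2]) % P]
perms = {}
for g in GL2:
    perms.setdefault(action(g), []).append(g)

def compose(p, q):
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

def closure(gens):
    identity = tuple(range(4))
    seen, frontier = {identity}, [identity]
    while frontier:
        new = []
        for g in frontier:
            for h in gens:
                p = compose(g, h)
                if p not in seen:
                    seen.add(p)
                    new.append(p)
        frontier = new
    return seen

def derived_subgroup(G):
    comms = {compose(compose(g, h), compose(inverse(g), inverse(h))) for g in G for h in G}
    return closure(comms)

series = [set(perms)]
while True:
    nxt = derived_subgroup(series[-1])
    if nxt == series[-1]:                 # series has stabilised
        break
    series.append(nxt)
sizes = [len(H) for H in series]

assert len(GL2) == 48 and len(perms) == 24          # all 4! permutations occur
assert sorted(perms[(0, 1, 2, 3)]) == [(1, 0, 0, 1), (2, 0, 0, 2)]   # kernel = scalars
assert sizes == [24, 12, 4, 1]                      # S_4 > A_4 > V_4 > 1
```

The derived series terminates in the trivial group, which is exactly the statement that $S_4$ (and hence any subgroup of it) is solvable.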

Now what about irreducibility? From the fact that $\det \bar{\rho}_{E,3}(\tau) = -1$
(and $\tau^2 = 1$), it follows that $\bar{\rho}_{E,3}(\tau)$ has distinct eigenvalues in $\mathbf{F}_3$
(namely 1 and −1). We claim this implies $\bar{\rho}_{E,3}$ is absolutely irreducible,
i.e., irreducible over $\bar{\mathbf{F}}_3$ as well as $\mathbf{F}_3$. Indeed, the only matrices in
$M_2(\bar{\mathbf{F}}_3)$ which can commute with $\bar{\rho}_{E,3}(G_{\mathbf{Q}})$ (in particular with
$\bar{\rho}_{E,3}(\tau)$, which we may take to be $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$, and with some non-diagonal
matrix $\bar{\rho}_{E,3}(g)$) are the scalar matrices $\lambda I$ themselves. Hence by Schur’s
Lemma, $\bar{\rho}_{E,3}$ is absolutely irreducible.

Now suppose that the complex representation $\sigma = \Psi \circ \bar{\rho}_{E,3}$ is not
irreducible. We claim this implies its image in $GL_2(\mathbf{C})$ must be abelian.
Indeed, any complex representation of a finite (or compact) group is
completely reducible. In the case of $\sigma$, this means $\sigma$ is the sum of two characters,
and this clearly implies that its image in $GL_2(\mathbf{C})$ is abelian.

On the other hand, as $\bar{\rho}_{E,3}$ is absolutely irreducible, the only
matrices commuting with its image in $GL_2(\mathbf{F}_3)$ must be scalar ones (again by
Schur’s Lemma). So pulling back through the embedding $\Psi$, we conclude
$\bar{\rho}_{E,3}$ has both an abelian and irreducible image in $GL_2(\mathbf{F}_3)$, an obvious
contradiction. Thus $\sigma = \Psi \circ \bar{\rho}_{E,3}$ must after all be irreducible.

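Step 3. Apply Theorem 1.3 to get a normalized eigenform

$$g(z) = \sum_{n=1}^{\infty} b_n e^{2\pi i n z} \quad \text{in some } S_1(\Gamma_0(N_1), \varepsilon_1)$$

with

$$(1.4.3)\qquad b_q = \operatorname{trace}(\sigma(\mathrm{Fr}_q)) \quad \text{for almost all primes } q.$$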


Remark. Recall that for any normalized new form of weight $k \ge 1$ (and
character $\psi$), the Fourier coefficients $a_n$ (together with the values $\psi(n)$)
lie in the ring of integers $O_K$ of some number field K (of finite degree
over Q). In particular, for our form g(z) above, it makes sense to discuss
congruence conditions on the coefficients $b_n$ modulo a prime ideal $\mathfrak{p}$ of (the
appropriate) $O_K$.

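Now recall that (1.4.1) implies

$$\operatorname{trace}(\sigma(\mathrm{Fr}_q)) = \operatorname{trace}(\Psi \circ \bar{\rho}_{E,3}(\mathrm{Fr}_q)) \equiv \operatorname{trace}(\bar{\rho}_{E,3}(\mathrm{Fr}_q)) \pmod{1+\sqrt{-2}}.$$

So by (1.1.2), (1.4.3), and the above Remark, we have

$$(1.4.4)\qquad b_q \equiv q + 1 - \#E(\mathbf{F}_q) \pmod{\mathfrak{p}}$$

for almost every q, and $\mathfrak{p}$ some prime of $\bar{\mathbf{Q}}$ containing $(1+\sqrt{-2})$ (and hence
3). In other words, we’ve proven that $\bar{\rho}_{E,3}$ is modular, but with the hitch
that $\sum_{n=1}^{\infty} b_n e^{2\pi i n z}$ is of weight one instead of two! To remedy this, we follow
two ideas, going back respectively to Shimura, and Deligne-Serre.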

Step 4. Pick a modular (non-cuspidal) form E of weight 1, such
that $E \equiv 1 \pmod 3$; the product

$$g(z)E(z) = \sum_{n=1}^{\infty} c_n e^{2\pi i n z}$$

is of weight 2 (for some level N and character $\psi$), and

$$c_n \equiv b_n \pmod{\mathfrak{p}}$$

(for $\mathfrak{p}$ a prime of $\bar{\mathbf{Q}}$ lying above 3...).
More explicitly, take

$$E(z) = E_{1,\chi}(z) = 1 + 6 \sum_{n=1}^{\infty} \sum_{d \mid n} \chi(d)\, e^{2\pi i n z}$$

where

$$\chi(d) = \begin{cases} 0 & \text{if } d \equiv 0 \pmod 3 \\ 1 & \text{if } d \equiv 1 \pmod 3 \\ -1 & \text{if } d \equiv -1 \pmod 3 \end{cases}$$

is an odd Dirichlet character mod 3 (i.e., $\chi(-1) = -1$). Then $E_{1,\chi} \in
M_1(\Gamma_0(3), \chi)$, and $gE_{1,\chi}$ belongs to $S_2(\Gamma_0(3N_1), \varepsilon_1\chi)$. Furthermore, each
“non-constant” Fourier coefficient of $E_{1,\chi}$ is divisible by 3 (in fact 6), so it
easily follows that

$$c_n \equiv b_n \pmod{\mathfrak{p}}.$$
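
The two facts used in Step 4 (every non-constant coefficient of the weight one Eisenstein series is divisible by 6, and therefore multiplication by it does not change coefficients mod 3) are easy to confirm numerically. The sketch below assumes nothing about the actual form g: the sequence `b` is an arbitrary hypothetical integer sequence standing in for its coefficients. As an independent sanity check it also compares the Eisenstein coefficients with the representation numbers of $x^2 + xy + y^2$, a classical identity that is not taken from the text.

```python
from math import isqrt

NMAX = 60

def chi(d):
    """The odd Dirichlet character mod 3 used in Step 4."""
    return (0, 1, -1)[d % 3]

def eisenstein_coefficients(nmax):
    """e[0..nmax], where E = 1 + 6 * sum_{n>=1} (sum_{d|n} chi(d)) q^n."""
    e = [1] + [0] * nmax
    for n in range(1, nmax + 1):
        e[n] = 6 * sum(chi(d) for d in range(1, n + 1) if n % d == 0)
    return e

def theta_coefficients(nmax):
    """Number of integer pairs (x, y) with x^2 + x*y + y^2 = n, for n = 0..nmax."""
    r = [0] * (nmax + 1)
    bound = 2 * isqrt(nmax) + 2
    for x in range(-bound, bound + 1):
        for y in range(-bound, bound + 1):
            n = x * x + x * y + y * y
            if n <= nmax:
                r[n] += 1
    return r

def product_coefficients(b, e):
    """c[n] for (sum_{n>=1} b[n] q^n) * (sum_{n>=0} e[n] q^n)."""
    c = [0] * len(b)
    for n in range(1, len(b)):
        c[n] = sum(b[m] * e[n - m] for m in range(1, n + 1))
    return c

e = eisenstein_coefficients(NMAX)
theta = theta_coefficients(NMAX)

# hypothetical stand-in for the coefficients b_n of g (b_1 = 1, no constant term)
b = [0, 1] + [((7 * n * n + 3 * n) % 11) - 5 for n in range(2, NMAX + 1)]
c = product_coefficients(b, e)

assert all(e[n] % 6 == 0 for n in range(1, NMAX + 1))
assert theta == e
assert all((c[n] - b[n]) % 3 == 0 for n in range(1, NMAX + 1))
```

The last assertion is just the observation that $c_n = b_n + \sum_{m<n} b_m e_{n-m}$ with every $e_{n-m}$ divisible by 3.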


Note that g(z)E(z) is the product of a normalized eigenform with an
Eisenstein series, but not itself such an eigenform. If it were, we would (by
(1.4.4)) have completed our task of proving $\bar{\rho}_{E,3}$ modular. To finish the
job, we need the following result of Deligne and Serre:


Lemma. (cf. §6.10 of [DS]) Suppose

$$f_1(z) = \sum_{n=1}^{\infty} c_n e^{2\pi i n z}$$

is a normalized element of $S_k(\Gamma_0(N), \psi)$, and K is a finite extension of Q
whose ring of integers contains the coefficients $c_n$ and $\psi(n)$. Suppose $\mathfrak{p}$ is
a prime ideal of $O_K$ containing 3, and that $f_1$ is a mod $\mathfrak{p}$ eigenform, i.e.,
there exists $b_n$ such that $T_n f_1 - b_n f_1 \equiv 0 \pmod{\mathfrak{p}}$ for all n. Then there
exists an f in $S_k(\Gamma_0(N), \psi)$, and $d_n$, such that for all n,

$$T_n f = d_n f,$$

and

$$d_n \equiv c_n \pmod{\mathfrak{p}'}$$

for some prime $\mathfrak{p}'$ dividing $\mathfrak{p}$.


We want to apply this Lemma to our modular form $g(z)E_{1,\chi}(z)$ of
weight 2. Since the constant term in the Fourier expansion of $E_{1,\chi}(z)$ is
1, this $gE_{1,\chi}$ is indeed normalized, i.e., $c_1 = 1$. Let K be its corresponding
number field with prime ideal $\mathfrak{p}$ (dividing 3). Since $E \equiv 1 \pmod 3$, we have

$$T_n f_1 \equiv b_n f_1 \pmod{\mathfrak{p}}$$

(since $T_n g = b_n g$) for all n. Thus the Lemma is indeed applicable (taking
$f_1 = g(z)E_{1,\chi}(z)$), and it produces a normalized form $f \in S_2(\Gamma_0(N), \psi)$ such
that $T_n f = a_n f$ for all n, and $a_n \equiv c_n \equiv b_n \pmod{\mathfrak{p}'}$. In particular, for
almost all q,

$$a_q \equiv q + 1 - \#E(\mathbf{F}_q) \pmod{\mathfrak{p}'},$$

as required.

§2. Automorphic Representations of Weight One

2.1. For $\sigma$ an irreducible, “odd,” solvable two-dimensional representation
of $G_{\mathbf{Q}}$, the relevant results of Langlands and Tunnell do not directly
produce the required modular form g(z), but rather (as already suggested)
a certain automorphic cuspidal representation $\pi(\sigma)$, which is shown to
correspond to such a form g(z). An honest explanation of how this works
requires (at least) the exposition of representation theory given in the pages
below. Roughly speaking, a classical eigen-cusp form g, of weight one, and
fixed level and character, generates a collection of irreducible
representations

$$\{\pi_p\}_{p \le \infty}$$

of the “local” groups $GL_2(\mathbf{Q}_p)$, each of which is uniquely determined by
the data attached to g(z); moreover this collection comprises what is called
an automorphic cuspidal representation $\pi$ of the adele group $GL_2(\mathbf{A}_{\mathbf{Q}})$.

Once the notion of “automorphic form” is liberated from its classical
(upper half-plane) setting, it seems completely natural to take the further
step of replacing Q by an arbitrary number field F, and $GL_2$ by any $GL_n$,
$n \ge 1$. Thus one ultimately views “Dirichlet characters mod n,” Hecke
characters of a number field, classical modular forms, and even “Maass cusp
forms” (cf. Remark 2.5.5) as manifestations of one and the same kind of
global object, namely, an automorphic representation of $GL_n$ over a number
field. It is this language which Langlands used to formulate the following
general Langlands Reciprocity Conjecture (LRC): for any n-dimensional
representation $\sigma$ of $\mathrm{Gal}(\bar{F}/F)$ (or more generally the Weil group $W_F$),
there is a corresponding automorphic representation

$$\pi(\sigma) = \otimes_v \pi_v$$

of $GL_n(\mathbf{A}_F)$ such that for almost all the primes v of F,

$$\operatorname{trace} \sigma(\mathrm{Fr}_v) = \operatorname{trace}(t_{\pi_v})$$

(with $t_{\pi_v}$ the Langlands class of $\pi_v$ in $GL_n(\mathbf{C})$); moreover, if $\sigma$ is irreducible,
then $\pi(\sigma)$ is cuspidal.

Our main goal in Lectures II and III will be to describe the ideas
behind the statement and the proof of this “Strong Artin Conjecture” in
case n = 2 and $\sigma$ is solvable, namely:

Theorem 2.1. To each irreducible

$$\sigma : W_F \longrightarrow GL_2(\mathbf{C})$$

with solvable image in $PGL_2(\mathbf{C})$, there corresponds an automorphic
cuspidal representation $\pi(\sigma) = \otimes \pi_v$ of $GL_2(\mathbf{A}_F)$ with the property that (its
central character equals $\det \sigma$ and)

$$\operatorname{trace}(t_{\pi_v}) = \operatorname{trace} \sigma(\mathrm{Fr}_v) \quad \text{for almost every prime } v \text{ of } F.$$

Our more modest goal in the rest of this Lecture is to explain how
Theorem 2.1 gives the required classical result (Theorem 1.3) when F = Q,
$\sigma$ factors through the Galois group $G_{\mathbf{Q}}$, and $\det(\sigma)$ is odd. One step is to
show that for such $\sigma$, the corresponding $\pi(\sigma)$’s are “automorphic
representations of $GL_2(\mathbf{A}_{\mathbf{Q}})$ of weight one.” This step will actually be postponed
until we discuss the correspondence $\sigma \to \pi(\sigma)$ in earnest in Lecture II,
cf. Proposition 4.2. A second step is to show there is a one-one
correspondence between these “automorphic representations of weight one” and the
classical new forms of weight one required in Theorem 1.3 (cf. Proposition
2.5).

2.2. Archimedean Representation Theory (GL2)

Let $G = G_\infty$ denote $GL_2(\mathbf{R})$, and $\mathfrak{g}$ its complexified Lie algebra. Let
$K = K_\infty = O(2,\mathbf{R})$ denote the real 2 × 2 orthogonal group, a maximal
compact subgroup of G.

Definition. An admissible representation of G on a Hilbert space H is a
homomorphism

$$\pi : G \longrightarrow GL(H)$$

such that (i) the map $(g, v) \to \pi(g)v$ from $G \times H \to H$ is continuous,
and (ii) (“admissibility”) the restriction of $\pi$ to K contains each irreducible
unitary representation of K with finite multiplicity (recall that each such
representation is automatically finite dimensional).


Remark. For an admissible representation $\pi : G \to \mathrm{Aut}(H)$, let $V =
V_K \subset H$ denote the subspace of K-finite vectors, i.e., those v in H whose
translates under $\pi(K)$ span a finite dimensional space. Such vectors are
not preserved by the action of $\pi(G)$, but they are by the corresponding
(differentiated) action of the Lie algebra $\mathfrak{g}$ of G. In fact, as a representation
space jointly for the action of $\mathfrak{g}$ and K, V enjoys certain “compatibility”
properties which ensure that it is (what’s called) a $(\mathfrak{g}, K)$-module. The
advantage of these modules is that they are “algebraic” linear objects as
opposed to “analytic” Lie group theoretic objects, and yet they accurately
reflect the nature and properties of $\pi$. For example, $\pi$ is irreducible in the
usual sense (that H has no Hilbert space subspaces invariant under $\pi(G)$)
if and only if V is irreducible in the algebraic sense (that it has no vector
space subspaces invariant under both $\pi(K)$ and $d\pi(\mathfrak{g})$). This leads to an
equivalence between the natural categories of irreducible admissible
representations of G and irreducible $(\mathfrak{g}, K)$ modules. In particular, whenever
convenient, we allow ourselves to confuse $\pi$ with the $(\mathfrak{g}, K)$ module V.


Example 2.1. Let $\mu_1$ and $\mu_2$ denote two characters of $\mathbf{R}^\times$ (i.e., not
necessarily “unitary” homomorphisms from $\mathbf{R}^\times$ to $\mathbf{C}^\times$). Let $H = H(\mu_1, \mu_2)$
denote the Hilbert space of functions $f : G \to \mathbf{C}$ such that

$$f\left(\begin{pmatrix} a_1 & * \\ 0 & a_2 \end{pmatrix} g\right) = \left|\frac{a_1}{a_2}\right|^{1/2} \mu_1(a_1)\mu_2(a_2) f(g)$$

for all $\begin{pmatrix} a_1 & * \\ 0 & a_2 \end{pmatrix}$ in the Borel subgroup B consisting of upper triangular
matrices of G, and such that

$$\|f\|^2 = \int_K |f(k)|^2\, dk < \infty.$$

(Strictly speaking, one first looks at continuous such f, and then defines
$H(\mu_1, \mu_2)$ to be the closure of such functions with respect to the norm
$\|f\|$.) Then the right regular action of G on functions f in $H(\mu_1, \mu_2)$
defines an admissible representation of G which is denoted $\pi(\mu_1, \mu_2)$ and
called the representation of G induced from the character $\mu_1\mu_2$ (of B). The
corresponding $(\mathfrak{g}, K)$ module $V_K$ consists of finite linear combinations of
smooth functions $\phi_k$ defined by

$$\phi_k(g) = \mu_1(a_1)\mu_2(a_2)\left|\frac{a_1}{a_2}\right|^{1/2} e^{-ik\theta}$$

if g has “Iwasawa decomposition”

$$g = \begin{pmatrix} a_1 & * \\ 0 & a_2 \end{pmatrix} r(\theta).$$

(Here $r(\theta)$ denotes the rotation element $\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$ in K, and there is
an obvious compatibility condition coming from the elements $\pm I$ in $B \cap K$,
namely $\mu_1\mu_2(-1) = e^{-ik\pi}$.) From this picture it is a simple matter to prove
that $\pi(\mu_1, \mu_2)$ is irreducible if and only if $\mu_1\mu_2^{-1}(x) \ne x^p\, \mathrm{sgn}(x)$ with p a
non-zero integer. Slightly less transparent is the proof of the following:


Fact. Every irreducible admissible representation $\pi$ of G is (equivalent
to) a $\pi(\mu_1, \mu_2)$, or an irreducible subrepresentation thereof. For example,
suppose $\mu_1\mu_2^{-1}(x) = x^p\, \mathrm{sgn}(x)$ with p a positive integer. Then $H(\mu_1, \mu_2)$
contains exactly one invariant subspace, namely

$$\{\ldots, \phi_{-p-3}, \phi_{-p-1}, \phi_{p+1}, \phi_{p+3}, \ldots\},$$

and the restriction of $\pi(\mu_1, \mu_2)$ to this subspace realizes an irreducible
discrete series representation of “lowest weight p + 1.”


Concluding Remarks. (a) All the above notions, and most of the results,
hold for an arbitrary semisimple or reductive Lie group; in particular, every
irreducible admissible representation of such a group is still realizable as
a subrepresentation of the analogous induced representation. (The theory
for $GL_2(\mathbf{R})$ is essentially Bargmann’s, and the general theory essentially
due to Harish-Chandra; for many more details, see the recent survey paper
of Knapp [Kn].)

(b) For any irreducible admissible representation $\pi$, an analogue of Schur’s
Lemma implies that there is a character of $\mathbf{R}^\times$, denoted $\omega_\pi$ and called the
central character of $\pi$, such that

$$\pi\left(\begin{pmatrix} r & 0 \\ 0 & r \end{pmatrix}\right) = \omega_\pi(r) I \quad \text{for all } r \in \mathbf{R}^\times.$$


2.3. P-adic Representation Theory (GL2) 

In this Section, F is a p-adic field with ring of integers $O_F$, and G =
$GL_2(F)$. As in the real case, most of the notions and facts we review
below extend not only to $GL_n$, but to an arbitrary reductive p-adic group;
however, we recall here only those facts (even for $GL_2$) which are really
needed in the sequel.

Definition 2.3.1. An admissible representation of G on a vector space V
is a homomorphism

$$\pi : G \longrightarrow GL(V)$$

such that (i) the stabilizer in G of any v in V is open, and (ii)
(“admissibility”) for any compact open subgroup $K^0 \subset G$, the space

$$\{v \in V : \pi(k)v = v \text{ for all } k \in K^0\}$$

is finite-dimensional.

Remark. Suppose $\pi$ is an irreducible unitary representation of G in some
Hilbert space H (same definition as in the case of Lie groups), and $V_K$ is its
subspace of K-finite vectors (for any compact open K, say $K = GL_2(O_F)$).
Then (by a Theorem of J. Bernstein) the restriction of $\pi(G)$ to $V = V_K$
produces an admissible representation of G in the above sense. Thus the
p-adic notion of an admissible representation (on a vector space V) is a
natural analogue of the archimedean notion of a $(\mathfrak{g}, K)$ module. What’s
special in the p-adic case is that G itself acts on $V_K$ (and there is no need
to go to the Lie algebra...).


Example (of $\pi(\mu_1, \mu_2)$). For each pair of characters $\mu_i : F^\times \to \mathbf{C}^\times$,
the induced representations $\pi(\mu_1, \mu_2)$ in $H(\mu_1, \mu_2)$ are defined just as in
the archimedean case. But now reducibility is possible only if $\mu_1\mu_2^{-1}(x) =
|x|^{\pm 1}$, i.e., there is no room for many “discrete series” representations to
appear as subrepresentations of $\pi(\mu_1, \mu_2)$. This reflects the fact that there
are now representations which are absolutely cuspidal, i.e., they cannot be
constructed in this simple way.


Fortunately, we shall not need to discuss the cuspidal representations
in these Lectures, but rather only those representations which are as far
from being cuspidal as possible!

Definition 2.3.1. An irreducible admissible representation $\pi$ of G is
unramified (or of conductor zero) if its space of $K = GL_2(O_F)$ fixed vectors
is non-zero.

In this case, it is known that the space of K-fixed vectors is
one-dimensional, and that $\pi$ is either one-dimensional (of the form $\chi \circ \det$, for
some unramified character $\chi$ of $F^\times$), or an irreducible $\pi(\mu_1, \mu_2)$ with $\mu_1$ and
$\mu_2$ unramified characters of $F^\times$. As we shall recall in the next paragraph,
it is these latter representations which typically play a crucial role in the
adelic theory of automorphic representations.

Concluding Remark. Even if some $\pi$ does not have “conductor zero,”
there will still exist a smallest (positive) integer N (called the conductor
of $\pi$) such that the space

$$\left\{ v \in V : \pi\left(\begin{pmatrix} a & b \\ c & d \end{pmatrix}\right) v = \omega_\pi(a) v \text{ for all } k = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in K^N \right\}$$

is non-zero. (Here $K^N$ denotes the Hecke congruence subgroup of K
consisting of $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ with $c \equiv 0 \pmod{\mathfrak{p}^N}$.) This fact was first proved in
[Cas], and then generalized to $GL_n$ in [J-PS-S1]; if N = conductor($\pi$), then
it is also known (as in the case N = 0) that the space above is automatically
one-dimensional.

2.4. Adelic Representations (GL2) 

This is mostly a matter of putting together the local representation theory.
Suppose we are given a collection of irreducible admissible
representations $\{\pi_p\}$ of the local groups $G_p = GL_2(\mathbf{Q}_p)$, such that $\pi_p$ is unramified
for all but finitely many p. Then the restricted tensor product

$$\bigotimes_{p \le \infty} \pi_p$$

makes sense as a representation of the restricted product $GL_2(\mathbf{A}_{\mathbf{Q}}) =
\prod'_{p \le \infty} GL_2(\mathbf{Q}_p)$, and defines an irreducible “admissible” representation $\pi$ in
a sense that can be made precise; cf. §4.c of [Gel] or §9 of [JL] for more
details. Conversely, any irreducible “admissible” (and in particular any
irreducible unitary) representation $\pi$ of $GL_2(\mathbf{A}_{\mathbf{Q}})$ is uniquely factorizable
as

$$\pi = \otimes \pi_p,$$

with almost every $\pi_p$ unramified. Moreover, the obvious analogues of these
statements hold for an arbitrary number field F.
Example $G = GL_1$. In this case, we consider a collection of characters $\chi_v$
of $F_v^\times$, one for each prime of F, such that almost all the $\chi_v$’s are unramified,
i.e., trivial on $O_v^\times$. Then

$$\chi = \prod_v \chi_v$$

defines a nice (continuous) character of the ideles $\mathbf{A}_F^\times = GL_1(\mathbf{A}_F)$, and
every such character thus arises (cf. Proposition 7-1-12 of [Gold]). Of course,
in number theory one is primarily concerned with characters $\chi$ of $\mathbf{A}_F^\times$ which
are trivial on $F^\times$, i.e., with “grossencharacters,” or characters of the idele
class group

$$GL_1(F) \backslash GL_1(\mathbf{A}_F) = F^\times \backslash \mathbf{A}_F^\times.$$

These are the “automorphic representations” of GL(1) over F.


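Special Example. Since Q has class number one,

$$(*)\qquad \mathbf{A}_{\mathbf{Q}}^\times = \mathbf{Q}^\times \cdot \mathbf{R}^+ \cdot \prod_{p<\infty} \mathbf{Z}_p^\times.$$

Thus a Dirichlet character $\psi : (\mathbf{Z}/N\mathbf{Z})^\times \to \mathbf{C}^\times$ determines a
grossencharacter $\chi_\psi$ as follows: for any $p < \infty$, $\psi$ can be pulled back to a character
$\chi_p$ of $\mathbf{Z}_p^\times$ through the canonical homomorphism from $\mathbf{Z}_p^\times$ to $(\mathbf{Z}/N\mathbf{Z})^\times$; the
product $\prod_{p<\infty} \chi_p$ then defines a character of $\prod_{p<\infty} \mathbf{Z}_p^\times$, and hence by (*) a
character $\chi_\psi$ of $\mathbf{A}_{\mathbf{Q}}^\times$ trivial on $\mathbf{R}^+$ as well as $\mathbf{Q}^\times$. In this way one obtains a
grossencharacter of $\mathbf{A}_{\mathbf{Q}}^\times$ of finite order, and all such grossencharacters arise
in this way for suitably large N. (Note that $\chi_p$ is unramified for the primes
p not dividing N.)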
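
The local components of a Dirichlet character can be made explicit with the Chinese Remainder Theorem: the component at a prime p dividing N only depends on a unit modulo the exact power of p in N, and the components at primes not dividing N are trivial. The sketch below illustrates this factorization for a sample character mod 12. The character, the modulus and the function names are hypothetical choices made for this example, and no attempt is made to fix the normalization of the associated idele class character.

```python
N = 12
# a sample (hypothetical) Dirichlet character mod 12, given by its table of values
PSI = {1: 1, 5: -1, 7: -1, 11: 1}
PRIME_POWERS = {2: 4, 3: 3}          # N = 2^2 * 3

def local_component(p):
    """Return psi_p, a function on integers prime to p that depends only on u mod p^a."""
    pa = PRIME_POWERS[p]
    rest = N // pa
    def psi_p(u):
        # CRT: the class mod N that is u mod p^a and 1 mod N/p^a
        lift = next(x for x in range(N) if x % pa == u % pa and x % rest == 1 % rest)
        return PSI[lift]
    return psi_p

psi_2 = local_component(2)
psi_3 = local_component(3)

# psi is the product of its local components
for n in PSI:
    assert PSI[n] == psi_2(n) * psi_3(n)

# each local component is itself multiplicative on units
for u in (1, 3):
    for v in (1, 3):
        assert psi_2(u * v) == psi_2(u) * psi_2(v)
for u in (1, 2):
    for v in (1, 2):
        assert psi_3(u * v) == psi_3(u) * psi_3(v)
```

In this example the component at 2 is the non-trivial character mod 4 and the component at 3 is the character $\chi$ mod 3 used in Step 4 of §1.4.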

Following this lead for $GL_1$, it is clear that one should define
automorphic representations for $G = GL_n$ in terms of the quotient space

$$GL_n(F) \backslash GL_n(\mathbf{A}).$$

For simplicity, we give a precise definition only for $G = GL_2$ and cuspidal
automorphic representations.

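Definition. Fix a grossencharacter $\omega$ of F, and let $L_0^2(G(F) \backslash G(\mathbf{A}), \omega)$
denote the (closure of the) space of all continuous $\varphi : G(\mathbf{A}) \to \mathbf{C}$ such
that

(i) $\varphi(\gamma g z) = \omega(z)\varphi(g)$ for all $\gamma \in G(F)$ and $z \in Z(\mathbf{A}) = \left\{\begin{pmatrix} z & 0 \\ 0 & z \end{pmatrix}\right\}$ (the center...);

(ii) $\int_{Z(\mathbf{A})G(F) \backslash G(\mathbf{A})} |\varphi(g)|^2\, dg < \infty$; and

(iii) $\varphi$ is cuspidal, i.e., for any g in $G(\mathbf{A})$,

$$\int_{F \backslash \mathbf{A}} \varphi\left(\begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix} g\right) dx = 0.$$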


Then an irreducible admissible (necessarily unitary) representation $\pi =
\otimes \pi_v$ of $G(\mathbf{A})$ is called automorphic cuspidal if there is some $\omega$ such that $\pi$
is equivalent to an irreducible summand of the right regular representation
of $G(\mathbf{A})$ in $L_0^2(G(F) \backslash G(\mathbf{A}), \omega)$.

Remarks. (i) In a completely similar way, the notion of an automorphic
cuspidal representation $\pi$ can be defined for $G = GL_n$ and more generally,
for an arbitrary reductive algebraic group G. In this generality, it is a
Theorem of Gelfand and Piatetski-Shapiro (cf. [GGPS]) that the right regular
representation $R_0$ of $G(\mathbf{A})$ in $L_0^2(G(F) \backslash G(\mathbf{A}), \omega)$ decomposes discretely,
and with finite multiplicities, i.e.,

$$R_0 = \bigoplus m_\pi \pi, \quad \text{with } m_\pi < \infty.$$

For $G = GL_n$, it is actually known that this multiplicity is one (cf. [JL],
[GK], and [Shal]), but for other groups this need no longer hold. For general
G, it is at least still true that any automorphic cuspidal representation has
a factorization of the type

$$\pi = \otimes \pi_v$$

discussed above, with $\pi_v$ almost everywhere an unramified representation
of $G(F_v)$.

(ii) There is also the notion of an automorphic (not necessarily cuspidal)
representation of $G(\mathbf{A})$, but as we shall not focus on these in the sequel,
we refrain from giving a precise definition. Suffice it to say that such $\pi$ are
the irreducible admissible representations of $G(\mathbf{A})$ which are built out of
cuspidal representations of the Levi components of G by way of induction
(and taking of quotients); for example, for $GL_2$, the automorphic
(non-cuspidal) representations are the quotients of the induced representations
$\pi(\mu_1, \mu_2)$ with $\mu_1$ and $\mu_2$ grossencharacters of F (viewed as “cuspidal
representations” of the diagonal subgroup of $GL_2$). For a thorough discussion,
see [BoJa] and [La2].


2.5. A Dictionary (Between the Classical and Modern Theories
for GL2)

In the last paragraph, we recalled how classical Dirichlet characters
correspond to certain automorphic representations of GL(1), namely those
grossencharacters which are of finite order. We now describe an analogous
result for GL(2).


Proposition. There is a 1-1 correspondence between normalized new
forms

$$f(z) = \sum_{n=1}^{\infty} a_n e^{2\pi i n z}$$

in $S_k(\Gamma_0(N), \psi)$, and irreducible automorphic cuspidal representations

$$\pi = \otimes_p \pi_p$$

of $G(\mathbf{A})$ such that:
(a) the central character of $\pi$ is $\chi_\psi$,
(b) $\pi_p$ is unramified for all $p \nmid N$, and
(c) $\pi_\infty$ has “lowest weight k.”
Moreover, if $N = \prod p_i^{a_i}$ and $p = p_i$ divides N, then $\pi_p$ has conductor $a_i$; but if $p \nmid N$,
then $\pi_p = \pi(\mu_1, \mu_2)$, with $\mu_1, \mu_2$ unramified characters of $\mathbf{Q}_p^\times$ such that

$$(2.5.1)\qquad p^{(k-1)/2}(\mu_1(p) + \mu_2(p)) = a_p.$$


Remark 2.5.2. In case k > 1, then $\pi_\infty$ is a discrete series representation
of “lowest weight k,” sitting inside some $\pi_\infty(\mu_1, \mu_2)$ (cf. Fact 2.2); this
discrete series representation has central character trivial on $\mathbf{R}^+$, and is
denoted $D_k$. However, if k = 1, then $\pi_\infty$ is not a proper subrepresentation of any
$\pi_\infty(\mu_1, \mu_2)$, but rather equals the full induced (“principal series”)
representation $\pi_\infty(1, \mathrm{sgn})$ (see the proof below). In this case, the corresponding
representation $\pi = \otimes \pi_p$ (with $\pi_\infty = \pi_\infty(1, \mathrm{sgn})$) is called an automorphic
cuspidal representation of weight one.


Corollary of the Proposition. A normalized new form in $S_1(\Gamma_0(N), \psi)$,
with eigenvalues $\{a_p\}$, is one and the same thing as an automorphic
cuspidal representation $\otimes \pi_p$ of weight one such that (cf. (2.5.1))

$$a_p = \mu_1(p) + \mu_2(p), \quad \text{for all } p \nmid N.$$

N.B. Here $\mu_1$ and $\mu_2$ are the two unramified characters of $\mathbf{Q}_p^\times$ inducing
the unramified representation $\pi_p = \pi_p(\mu_1, \mu_2)$ of $GL_2(\mathbf{Q}_p)$. From now on,
we rewrite the above relation in the more suggestive form

$$(2.5.3)\qquad a_p = \operatorname{trace}(t_{\pi_p})$$

with

$$t_{\pi_p} = \begin{pmatrix} \mu_1(p) & 0 \\ 0 & \mu_2(p) \end{pmatrix} \quad \text{in } GL_2(\mathbf{C})$$

the so-called Langlands class of $\pi_p$.
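
To see what relation (2.5.1) says numerically, one can solve it for the pair $(\mu_1(p), \mu_2(p))$ once the central character is known. The sketch below does this in the simplest situation of trivial central character, where $\mu_1(p)\mu_2(p) = 1$, so that the two values are the roots of $X^2 - a_p p^{-(k-1)/2} X + 1$. The sample eigenvalues are those of the weight two newform of level 11; they are an illustrative assumption and do not come from the text, and the function name is hypothetical.

```python
import cmath
import math

# sample Hecke eigenvalues a_p (weight two, level 11, trivial character)
A_P = {2: -2, 3: -1, 5: 1, 7: -2, 13: 4}
K = 2

def langlands_class(p, a_p, k=K):
    """Diagonal entries (mu1(p), mu2(p)) of the Langlands class, assuming mu1*mu2 = 1."""
    t = a_p / p ** ((k - 1) / 2)          # trace of the class, from (2.5.1)
    disc = cmath.sqrt(t * t - 4)          # purely imaginary when |a_p| < 2 p^{(k-1)/2}
    return (t + disc) / 2, (t - disc) / 2

for p, a_p in A_P.items():
    mu1, mu2 = langlands_class(p, a_p)
    assert abs(mu1 * mu2 - 1) < 1e-12                     # trivial central character
    assert abs(mu1 + mu2 - a_p / math.sqrt(p)) < 1e-12    # relation (2.5.1) with k = 2
    assert abs(abs(mu1) - 1) < 1e-12                      # the class is unitary here
    print(p, mu1, mu2)
```

In weight one the normalizing factor $p^{(k-1)/2}$ is 1, which is why the Corollary reads simply $a_p = \mu_1(p) + \mu_2(p)$.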


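Sketch of the Proof (of the Proposition). The first step is at the level of
functions. Using the decomposition

$$G(\mathbf{A}_{\mathbf{Q}}) = GL_2(\mathbf{Q})\, GL_2^+(\mathbf{R}) \prod_{p<\infty} K_p^N$$

(analogous to (*) for $GL_1$), one defines for any f in $S_k(\Gamma_0(N), \psi)$ a function
$\varphi_f$ on $G(\mathbf{A}_{\mathbf{Q}})$ by

$$(2.5.4)\qquad \varphi_f(\gamma g_\infty k_0) = f(g_\infty(i))\, j(g_\infty, i)^{-k}\, \chi_\psi(k_0),$$

where

$$K_p^N = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in K_p = GL_2(\mathbf{Z}_p) : c \equiv 0 \ (N) \right\}$$

and

$$j(g_\infty, z) = (cz + d)(\det g_\infty)^{-1/2} \quad \text{if } g_\infty = \begin{pmatrix} a & b \\ c & d \end{pmatrix}.$$

It is now a standard matter (cf. [Cas] and [De]) to check that this map
$f \to \varphi_f$ is an isomorphism from $S_k(\Gamma_0(N), \psi)$ to the space of smooth
functions $\{\varphi\}$ on $GL_2(\mathbf{A}_{\mathbf{Q}})$ such that
(i) $\varphi(\gamma g) = \varphi(g)$ for all $\gamma \in GL_2(\mathbf{Q})$;
(ii) $\varphi(gk) = \varphi(g)\chi_\psi(k)$ for all $k \in \prod_{p<\infty} K_p^N$;
(iii) $\varphi(g r(\theta)) = e^{-ik\theta}\varphi(g)$ for all $r(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \in K_\infty$;
(iv) $\varphi(zg) = \chi_\psi(z)\varphi(g)$ for all $z \in Z(\mathbf{A})$;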
(v) If X denotes the “pushing down” differential operator corresponding
to the element $\frac{1}{2}\begin{pmatrix} 1 & -i \\ -i & -1 \end{pmatrix}$ of $\mathfrak{g}$, then

$$X \cdot \varphi = 0$$

(this is the condition reflecting the holomorphy of f);

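(vi) $\varphi$ is of “moderate growth” on $G(\mathbf{A})$ (reflecting the holomorphy of f
“at the cusps”); and

(vii) $\varphi$ is cuspidal, i.e.,

$$\int_{\mathbf{Q} \backslash \mathbf{A}} \varphi\left(\begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix} g\right) dx = 0$$

(reflecting the cuspidality of f...).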

Conditions (i)-(vii) imply that $\varphi_f$ belongs to $L_0^2(G(\mathbf{Q}) \backslash G(\mathbf{A}), \chi_\psi)$.
The second step of the proof is then to show that $\varphi_f$ generates (under
right translation by $G(\mathbf{A})$) an irreducible invariant subspace $\pi_f$ (of $L_0^2(\chi_\psi)$)
whose corresponding representation $\pi = \otimes \pi_p$ is as claimed; one must also
show that all such representations are thus obtained. These arguments are
explained in §5 of [Gel], but only really for the case k > 1, where $\pi_\infty$ is the
discrete series representation $D_k$. In case k = 1, one must argue as follows
(in order to identify $\pi_\infty$).

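Suppose $\mu_i(x) = |x|^{s_i}\, \mathrm{sgn}(x)^{\varepsilon_i}$, and $\pi_\infty = \pi(\mu_1, \mu_2)$. Then $H(\mu_1, \mu_2)$
consists only of functions $\phi_k$ with k of the same parity as $\varepsilon_1 + \varepsilon_2$. (In
particular, $\pi(1, \mathrm{sgn})$ consists only of “odd” functions....) A straightforward
computation (à la Bargmann...) also shows that, with $s = s_1 - s_2$,

$$X \cdot \phi_k = \left(\frac{s + 1 - k}{2}\right) \phi_{k-2}$$

and

$$\phi_k\left(\begin{pmatrix} r & 0 \\ 0 & r \end{pmatrix} g\right) = r^{s_1 + s_2} \phi_k(g), \quad \text{for } r > 0.$$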
This means that if $\omega_{\pi_\infty}$ is to be trivial on $\mathbf{R}^+$, we must have $s_1 = -s_2 = s/2$,
and if $X \cdot \phi_1$ is to be 0, we must have $s_1 = s_2 = s = 0$, i.e., $\pi_\infty = \pi(1, \mathrm{sgn})$
as claimed.

Concerning the converse direction, we recall the following. Suppose
$\pi = \otimes \pi_p$ is an irreducible subrepresentation of $L_0^2(G(\mathbf{Q}) \backslash G(\mathbf{A}_{\mathbf{Q}}), \chi_\psi)$ (for
some grossencharacter $\chi_\psi$ of finite order), and $\pi_\infty$ is “of weight k” ($k \ge 1$).
If the conductor $c(\pi) = \prod_{p<\infty} c(\pi_p)$ of $\pi$ is N, let $\varphi_\pi$ denote the function in
the space $H_\pi$ of $\pi$ which is right $K_p^N$ invariant for all p. (This function is
uniquely determined up to a scalar; cf. the Concluding Remark of §2.3.)
Then via the correspondence $\varphi_\pi \to f_\pi$ (inverse to (2.5.4)), we obtain from
$\pi$ the required new form of weight k.

Remark 2.5.5. The fact that the principal series $\pi(1, \mathrm{sgn})$ leads to a
holomorphic form f(z) (of weight one...) relies on the critical confluence of
conditions

$$X \cdot \varphi = 0$$

and

$$\varphi(g r(\theta)) = e^{-i\theta}\varphi(g).$$

Indeed, these two conditions together imply that the function f defined by

$$f(z) = \varphi(g_\infty)\, j(g_\infty, i), \quad \text{where } g_\infty(i) = z,$$

will be holomorphic on the upper half-plane $\mathfrak{h}$. The delicacy of this point
can be appreciated by examining what would happen if we took $\pi_\infty$ to be
the principal series representation $\pi(1, 1)$ (or $\pi(\mathrm{sgn}, \mathrm{sgn})$) in place of $\pi(1, \mathrm{sgn})$. In
this case, only $\phi_k$’s of even parity occur in $H(\mu_1, \mu_2)$, $X \cdot \phi_k = \left(\frac{1-k}{2}\right)\phi_{k-2}$
is never zero, and no holomorphic f’s arise in $\mathfrak{h}$. The crucial point now is
that $\phi_0$ will be $K_\infty$-invariant, and hence directly define a function f(z) on
$\mathfrak{h}$, with the property that

$$\Delta f = -y^2\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}\right) f = \frac{1}{4} f.$$


Indeed, let D denote the standard Casimir operator in the center of the
universal enveloping algebra of $\mathfrak{g}$, which for $K_\infty$-invariant functions
corresponds exactly to the Laplace-Beltrami operator $\Delta$ above. Then the action
of D in $H(\mu_1, \mu_2)$ (with $\mu_i(x) = |x|^{s_i}\, \mathrm{sgn}(x)^{\varepsilon_i}$ and $s = s_1 - s_2$) is given by the formula

$$D\phi_k = \left(\frac{1 - s^2}{4}\right) \phi_k$$

(cf. Lemma 5.6 of [JL], page 166, keeping in mind that our Casimir is 1/2
theirs). Thus the same reasoning as used above (to show that an
automorphic cuspidal representation $\otimes \pi_p$ with $\pi_\infty = \pi(1, \mathrm{sgn})$ corresponds to
a classical cusp form of weight 1) shows also that a cuspidal representation
$\otimes \pi_p$ of $GL_2(\mathbf{A}_{\mathbf{Q}})$ with $\pi_\infty = \pi(1, 1)$ (or $\pi(\mathrm{sgn}, \mathrm{sgn})$) corresponds to a Maass
cusp form of “eigenvalue 1/4.”

2.6. Reformulation of Theorem 1.3
Suppose

$$\sigma : G_{\mathbf{Q}} \longrightarrow GL_2(\mathbf{C})$$

is a continuous, irreducible, two-dimensional “odd” representation whose
image in $PGL_2(\mathbf{C})$ is solvable. Then there exists an automorphic cuspidal
representation $\pi(\sigma) = \otimes_p \pi_p$ of $GL_2(\mathbf{A}_{\mathbf{Q}})$ which is of weight one, central
character $\det \sigma$, and such that for almost all p, $\pi_p = \pi(\mu_1, \mu_2)$ is unramified
with

$$(2.6.1)\qquad \operatorname{trace} \sigma(\mathrm{Fr}_p) = \operatorname{trace}(t_{\pi_p}) = \mu_1(p) + \mu_2(p).$$

Remarks. (2.6.2) Recall that the matrix

$$t_{\pi_p} = \begin{pmatrix} \mu_1(p) & 0 \\ 0 & \mu_2(p) \end{pmatrix}$$

is the Langlands class in $GL_2(\mathbf{C})$ attached to the unramified representation
$\pi_p$.

(2.6.3) According to Corollary 2.5, the existence of an automorphic
cuspidal $\pi = \otimes \pi_p$ as above implies the existence of a new form $f = \sum a_n e^{2\pi i n z}$
in some $S_1(\Gamma_0(N), \psi)$ with

$$a_p = \operatorname{trace} \sigma(\mathrm{Fr}_p)$$

for almost all p. Thus this representation theoretic reformulation of
Theorem 1.3 indeed implies Theorem 1.3.

(2.6.4) In the next few lectures we shall explain how the more general
Theorem 2.1 is proved; this will imply Theorem 2.6, in case F = Q, $\sigma$
factors through $G_{\mathbf{Q}}$, and $\det \sigma$ is odd, for it is a simple matter to see that
$\pi(\sigma)$ is then actually of weight one, i.e., $\pi_\infty = \pi(1, \mathrm{sgn})$ (cf. Proposition 4.2
of Lecture II).

(2.6.5) “Strong Multiplicity One” for GL(2) asserts that two automorphic
cuspidal representations $\pi$ and $\pi'$ are equivalent as soon as they are
equivalent almost everywhere, i.e.,

$$\pi_p \cong \pi'_p \quad \text{for almost all } p.$$

This fact is explicitly used, together with multiplicity one for GL(2), in the
proof that a new form f generates an irreducible subspace $\pi_f$ of $L_0^2$. In the
statement of Theorem 2.6 (or 2.1), it also implies that $\pi(\sigma)$ (if it exists at
all) is unique. Indeed, once the central character is fixed, condition (2.6.1),
which holds almost everywhere, uniquely determines $\pi_p$. Similarly, in the
classical version (Theorem 1.3) of Langlands-Tunnell, the new form g(z)
is uniquely determined by the condition (1.3.1) which fixes its eigenvalues
almost everywhere; this reflects the fact that the theory of new forms is
one and the same thing as the strong multiplicity one result coupled with
the notion of conductors! (See [Cas] or [Gel] for a further explanation of
this point.)

Lecture II
The Langlands Program: Some Results and Methods 


Abstract 

We start this lecture by describing the Local Langlands Conjecture (LLC)
for GL(n) over a local field F. In case n is a prime, this is a Theorem
over any field F, known as the “Local Langlands Correspondence.” Thus
we can (and will) describe the resulting correspondence in some detail
for n = 2, and apply it to refine (and generalize) the two-dimensional
Langlands Reciprocity Conjecture (LRC) as follows:

To each continuous irreducible representation

$$\sigma : W_F \longrightarrow GL_2(\mathbf{C})$$

of the Weil group of a number field F, there is associated an automorphic
cuspidal representation $\pi(\sigma) = \otimes \pi_v$ of $GL_2(\mathbf{A}_F)$ with the property that

$$\pi_v \longleftrightarrow \sigma_v$$

for every place v. (Here $\sigma_v$, a two-dimensional representation of the local
Weil group $W_{F_v}$, is the “Langlands parameter” of the corresponding
representation $\pi_v$ of $GL_2(F_v)$; for almost every v, $\sigma_v$ and $\pi_v$ are unramified,
and the relation $\sigma_v \leftrightarrow \pi_v$ reduces to the more familiar relation

$$\sigma_v(\mathrm{Fr}_v) \sim t_{\pi_v}.)$$

As we shall see, the “classical version” of the Langlands-Tunnell
Theorem (Theorem 1.3 or 2.6) follows immediately from the proof of the general
Reciprocity Conjecture in the solvable case; indeed, when F = Q, and $\sigma$
factors through $G_{\mathbf{Q}}$ and is “odd,” $\pi(\sigma)$ must be automorphic of weight one,
i.e., $\pi(\sigma)_\infty = \pi_\infty(1, \mathrm{sgn})$ (cf. Proposition 4.2).

In the second half of this lecture, we also begin to collect the automor- 
phic results required for proving the global LRC in the two-dimensional 
solvable case. As we shall see, all these results, as well as the LRC itself, 
are but special realizations of a “Principle of Functoriality with respect to 
the L-group,” namely: 

Langlands’ Functoriality Conjecture 

Given two reductive $F$-groups $G$ and $G'$ (with $G'$ quasi-split), and a
morphism

$$\rho : {}^L G \longrightarrow {}^L G'$$

between their L-groups, there is a corresponding mapping of automorphic
representations

$$\pi \longmapsto \pi(\rho) = \otimes \pi'_v$$

from $G$ to $G'$, such that for almost every $v$, $\rho$ takes the Langlands
class $t_{\pi_v}$ in ${}^L G$ to the Langlands class
$t_{\pi'_v} = t_{\pi_v(\rho)}$ in ${}^L G'$.


§3. The Local Langlands Correspondence for GL(2) 

3.1. The Archimedean Case 

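We assume $F = \mathbb{R}$. (For the simpler case of $F = \mathbb{C}$, which we
do not need, see Remark 3.1.2 below.) In this case, the Weil group
$W_\mathbb{R}$ is an extension of $\mathbb{C}^\times$ by $\mathbb{Z}/2\mathbb{Z}$
given by

$$W_\mathbb{R} = \mathbb{C}^\times \cup j\,\mathbb{C}^\times,$$

where $j^2 = -1$ and $j c j^{-1} = \bar{c}$, and the natural surjection

$$\varphi : W_\mathbb{R} \longrightarrow \mathrm{Gal}(\mathbb{C}/\mathbb{R})$$

is given by $\varphi(\mathbb{C}^\times) = 1$ and $\varphi(j\,\mathbb{C}^\times) = \tau$
(complex conjugation). We are interested in the set of equivalence classes
$\Phi(GL_n/\mathbb{R})$ of $n$-dimensional complex representations $\sigma$ of
$W_\mathbb{R}$ whose images consist of semisimple elements in $GL_n(\mathbb{C})$.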

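Example 3.1.1. The one-dimensional representations of $W_\mathbb{R}$ are of the
form $\mu \sim (t, \varepsilon)$, taking $z$ in $\mathbb{C}^\times$ to $|z|^t$,
$t \in \mathbb{C}$, and $j$ to $\varepsilon = \pm 1$. (Indeed, if $\mu(j) = w$,
then on $\mathbb{C}^\times$,
$\mu(\bar z) = \mu(j z j^{-1}) = w\,\mu(z)\,w^{-1} = \mu(z) = z^a \bar{z}^b$ with
$a = b$, i.e., $\mu(z) = |z|^t$; also $\varepsilon^2 = \mu(j^2) = \mu(-1) = 1$.)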
On the other hand, the two-dimensional irreducible representations of
$W_\mathbb{R}$ are all induced from some character

$$z \longmapsto |z|^{2t} \left( \frac{z}{|z|} \right)^{m}$$

of $\mathbb{C}^\times$, with $t$ arbitrary in $\mathbb{C}$, and $m \ge 1$ an integer.
Clearly these representations are “semisimple.” It is also easy to show that
every $n$-dimensional semisimple representation $\sigma$ of $W_\mathbb{R}$ is a
direct sum of these one- and two-dimensional irreducible representations.


Theorem. The Local Langlands Correspondence for $GL_n(\mathbb{R})$. There is a
well defined bijection

$$\sigma \longleftrightarrow \pi(\sigma)$$

between $\Phi(GL_n/\mathbb{R})$, the set of classes of $n$-dimensional semisimple
complex representations $\sigma$ of $W_\mathbb{R}$, and $\Pi(GL_n/\mathbb{R})$,
the set of classes of irreducible admissible representations $\pi$ of
$GL_n(\mathbb{R})$; moreover, the $L$ and $\varepsilon$ factors assigned to
$\sigma$ and $\pi$ are preserved by this correspondence.


Remarks. The existence of this correspondence, formulated and proved
more generally by Langlands for an arbitrary reductive Lie group, is the
subject matter of [La3]; the fact that $L$ and $\varepsilon$ factors may be
defined for $\sigma$ and $\pi$ in the context of $GL_n$, and then preserved by
this correspondence, is discussed in [Jal].

Example of $GL_2(\mathbb{R})$. Suppose first that $\sigma$ is the sum of two
one-dimensional representations (i.e., characters)
$\mu_i \sim (t_i, \varepsilon_i)$ as in Example 3.1.1. Then $\pi(\sigma)$ is taken
to be the unique irreducible quotient of the (induced representation)
$\pi(\mu_1, \mu_2)$, where $\mu_i(x) = |x|^{t_i} (\mathrm{sgn}(x))^{\varepsilon_i}$,
and the order of $t_1, t_2$ is arranged so that
$\mathrm{Re}(t_1) \ge \mathrm{Re}(t_2)$. For example, if
$\sigma = (\tfrac12, 0) \oplus (-\tfrac12, 0)$, then $\pi(\sigma)$ is the trivial
representation, whereas if $\sigma = (0,0) \oplus (0,1)$ (resp.
$(0,0) \oplus (0,0)$) then $\pi(\sigma)$ is the irreducible principal series
representation $\pi(1, \mathrm{sgn})$ (of “lowest weight 1”) (resp. the class 1
principal series representation $\pi(1,1)$, with Casimir eigenvalue
$\lambda = -\tfrac14$). On the other hand, suppose now that $\sigma$ is the
irreducible two-dimensional representation of Ex. 3.1.1, with parameters $t$
and $m \ge 1$. Then $\pi(\sigma)$ is taken to be the discrete series
representation $D_{m+1} \otimes |\det(\ )|^{t}$, with $D_{m+1}$ of lowest weight
$m + 1$ and trivial central character.

Remark 3.1.2. In case $F = \mathbb{C}$, the Weil group is just
$\mathbb{C}^\times$ and each $n$-dimensional semisimple representation $\sigma$ is
just a sum of characters $\mu_i$ of the form $(z/|z|)^{m_i} |z|^{t_i}$ with
$m_i \in \mathbb{Z}$. In this case, there are no discrete series, and to each
$\sigma$ as above, the corresponding $\pi(\sigma)$ is just the unique irreducible
quotient of $\mathrm{Ind}\, \mu_1 \mu_2 \cdots \mu_n$, with the $\mu_i$'s arranged
so that

$$\mathrm{Re}(t_1) \ge \cdots \ge \mathrm{Re}(t_n).$$


3.2. The p-adic case 


In case $F$ is a $p$-adic field, its Weil group $W_F$ is a dense subgroup of
$\mathrm{Gal}(\bar F/F)$, equipped with an isomorphism

$$F^\times \cong W_F^{\mathrm{ab}}.$$

In particular, the one-dimensional (complex) representations of $W_F$ are
again identified with the irreducible admissible representations (i.e.,
characters) of $F^\times = GL(1, F)$, just as in the archimedean case. However,
unlike in the archimedean case, there are now irreducible representations of
$W_F$ of arbitrary dimension (reflecting the existence of extensions of $F$ of
arbitrary degree). This fact considerably complicates the representation
theory (and the concomitant local Langlands correspondence) for GL(n).
Fortunately, for our purposes, we don’t need to describe the full Langlands
correspondence; instead, we need only the following:


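Theorem 3.2. For each two-dimensional “semisimple” representation $\sigma$
of $W_F$, there is exactly one irreducible admissible representation
$\pi = \pi(\sigma)$ of $GL_2(F)$ whose central character $\omega_\pi$ satisfies

(3.2.1) $$\omega_\pi(a)\, I = \pi\begin{pmatrix} a & 0 \\ 0 & a \end{pmatrix} = \det \sigma(a)\, I,$$

and such that for all characters $\chi$ of $F^\times$, and $\psi_F$ of $F$,

$$\begin{aligned}
L(s, \pi \otimes \chi) &= L(s, \sigma \otimes \chi), \\
L(s, \tilde\pi \otimes \chi^{-1}) &= L(s, \tilde\sigma \otimes \chi^{-1}), \\
\varepsilon(s, \pi \otimes \chi, \psi_F) &= \varepsilon(s, \sigma \otimes \chi, \psi_F).
\end{aligned}$$

Moreover, all irreducible admissible $\pi$ thus arise. (Here the $L$ and
$\varepsilon$ factors on the left-hand side are those of Jacquet-Langlands, and
those on the right-hand side the local factors of [La4]; “~” denotes the
contragredient representation.)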


Remark 3.2.2. The existence parts and exhaustion of the Theorem are
easy, except for the case of irreducible $\sigma$ (which is due, for arbitrary
$F$, to Kutzko [Kut]). The uniqueness part is Corollary 2.19 of [JL], and
the resulting bijection

$$\sigma \longleftrightarrow \pi(\sigma)$$

amounts to the Langlands correspondence for GL(2).

Caution. Missing in the image of the map

$$\sigma \longmapsto \pi(\sigma)$$

just described are the “special representations” of $GL_2(F)$. Although they
can be obtained by considering representations of the Weil-Deligne group
$W'_F$ in place of $W_F$ (see, for example, [Ta] or [Kud]), we prefer to ignore
these representations as they play no crucial role in the sequel. In fact,
for the global applications we have in mind to the Reciprocity Conjecture,
it is crucial to make explicit only the following unramified part of the
correspondence.


Example 3.2.3. Recall that if $k$ denotes the residue field of $F$, then $W_F$
consists of those elements of $\mathrm{Gal}(\bar F/F)$ whose image in
$\mathrm{Gal}(\bar k/k)$ is an integer power of the Frobenius automorphism
generator of $\mathrm{Gal}(\bar k/k)$. Thus the inertia subgroup $I$ of
$\mathrm{Gal}(\bar F/F)$ is contained in $W_F$, and a representation
$\sigma : W_F \to GL_2(\mathbb{C})$ is called unramified if it is trivial on $I$.
In this case, since $I \backslash W_F \cong \mathbb{Z}$ (integral powers of the
Frobenius generator of $\mathrm{Gal}(\bar k/k)$), $\sigma$ is completely
determined by where it takes the (class of a) Frobenius element $\mathrm{Fr}$
of $W_F$. So suppose (after conjugation, if necessary) that

$$\sigma(\mathrm{Fr}) = \begin{pmatrix} q^{-s_1} & 0 \\ 0 & q^{-s_2} \end{pmatrix} \quad \text{in } GL_2(\mathbb{C}),$$

with $s_1, s_2 \in \mathbb{C}$. Then the corresponding representation
$\pi(\sigma)$ of $GL_2(F)$ will be the unramified induced representation
$\pi(\mu_1, \mu_2)$, with $\mu_i(x) = |x|^{s_i}$, i.e., the Langlands class

$$t_{\pi(\sigma)} = \begin{pmatrix} \mu_1(\varpi) & 0 \\ 0 & \mu_2(\varpi) \end{pmatrix}$$

(with $\varpi$ a uniformizer of $F$) will be conjugate to $\sigma(\mathrm{Fr})$.
(More precisely, if $\pi(\mu_1, \mu_2)$ is itself reducible, then $\pi(\sigma)$
will be the unique irreducible unramified quotient of $\pi(\mu_1, \mu_2)$,
perforce one-dimensional.)


§4. The Langlands Reciprocity Conjecture (LRC) 

4.1. Reformulation of Theorem 2.6 (“Langlands-Tunnell”)

For $F$ a global field, the Weil group $W_F$ maps surjectively onto the Galois
group $\mathrm{Gal}(\bar F/F)$, and there is a canonical isomorphism

$$W_F^{\mathrm{ab}} \cong F^\times \backslash \mathbb{A}_F^\times.$$

For each place $v$ of $F$, there is also an injection $W_{F_v} \to W_F$,
defining a map

$$\sigma \longmapsto \sigma_v$$

from the two-dimensional semisimple representations of $W_F$ to those of
$W_{F_v}$ (cf. [Ta] for background). For a given $\sigma$, almost all the
resulting $\sigma_v$'s will be unramified, and these unramified $\sigma_v$'s
uniquely determine $\sigma$. (This is “strong multiplicity one” on the
“Galois side.”)

Using the local Langlands correspondence for GL(2), we can now attach to any
nice $\sigma : W_F \to GL_2(\mathbb{C})$ a global representation $\pi(\sigma)$
of $GL_2(\mathbb{A}_F)$, namely

$$\pi(\sigma) = \otimes\, \pi(\sigma_v).$$

The thrust of the Conjecture below is that this $\pi(\sigma)$ must be
automorphic.

Conjecture (LRC). Suppose

$$\sigma : W_F \longrightarrow GL_2(\mathbb{C})$$

is irreducible. Then there exists an automorphic cuspidal representation
$\pi = \otimes \pi_v$ of $GL_2(\mathbb{A}_F)$ such that

$$\pi_v = \pi(\sigma_v) \quad \text{for all } v.$$

(In particular, the Hecke-Jacquet-Langlands L-function

$$L(s, \pi) = \prod_v L(s, \pi_v)$$

attached to $\pi$, which is known by [JL] to be entire, will equal the Artin
L-function $L(s, \sigma) = \prod_v L(s, \sigma_v)$.)


Remarks. (1) This conjecture is actually equivalent (via the “converse
theorem” for L-functions on GL(2)) to Artin’s conjecture for two-dimensional
irreducible $\sigma$; cf. 5.3.1 below. Thus this LRC is sometimes called the
“Strong Artin Conjecture.”

(2) If $\sigma$ is reducible and the sum of two grossencharacters $\mu_1$ and
$\mu_2$, then there is easily seen to be an automorphic (non-cuspidal)
representation $\pi = \otimes \pi_v$ of $GL(2, \mathbb{A}_F)$ with
$\pi_v = \pi(\sigma_v)$ for all $v$, namely the induced “Eisensteinian”
representation $\pi(\mu_1, \mu_2)$ (or appropriate irreducible quotient
thereof).

(3) When $F = \mathbb{Q}$, and $\sigma : W_\mathbb{Q} \to GL_2(\mathbb{C})$
factors through $G_\mathbb{Q}$, the above form of the LRC clearly implies the
“almost everywhere” version which we stated in §2.6. According to the
Proposition below, these two forms of the LRC are actually equivalent!


Proposition 4.1. Suppose $\sigma$ is a two-dimensional representation of
$W_F$, and $\pi$ is a cuspidal representation of $GL_2(\mathbb{A}_F)$. Then

$$\pi_v = \pi(\sigma_v) \quad \text{for all } v$$

if and only if

$$\operatorname{trace}(t_{\pi_v}) = \operatorname{trace}(\sigma_v(\mathrm{Fr}_v))$$

for almost all $v$ (where both $\pi_v$ and $\sigma_v$ are unramified).

Note that this last condition really says that, for the unramified places,
$\pi_v = \pi_v(\sigma_v)$ (cf. Example 3.2.3 above). Thus this proposition
essentially amounts to “strong multiplicity one” for GL(2); for further
discussion, see pages 23-24 of [La].

4.2. Relations with Classical Forms


Proposition. Suppose $\sigma : G_\mathbb{Q} \to GL_2(\mathbb{C})$ is
irreducible, and “odd,” and let $\pi(\sigma)$ denote the corresponding
automorphic cuspidal representation of $GL_2(\mathbb{A}_\mathbb{Q})$ (assuming
it exists!). Then $\pi(\sigma)$ corresponds (via the correspondence
$f \leftrightarrow \pi_f$ already described) to a normalized new form

$$f(z) = \sum_{n=1}^{\infty} a_n e^{2\pi i n z} \in S_1(\Gamma_0(N), \psi)$$

with $N = \mathrm{conductor}(\sigma)$, $\psi$ determined by the central
character of $\pi(\sigma)$, and

$$a_p = \operatorname{trace}(\sigma(\mathrm{Fr}_p))$$

for almost every $p$.


Proof. By Proposition 2.5, it suffices to check that
$\pi_\infty = \pi_\infty(\sigma_\infty)$ is of the form $\pi(\mu_1, \mu_2)$, with
$\mu_1 = 1$ and $\mu_2 = \mathrm{sgn}(\ )$. Equivalently, we must check that
$\sigma_\infty$ is a sum of these two characters. But when viewed as a
representation of $W_\mathbb{R}$, $\sigma_\infty$ is clearly trivial on
$\mathbb{C}^\times$. This means $\sigma_\infty$ cannot be induced from a
non-trivial character of $\mathbb{C}^\times$. Thus $\sigma_\infty$ must be
reducible (cf. Example 3.1.1), say the sum of two characters $\mu_i$, with
$\mu_i \sim (t_i, \varepsilon_i)$. Since $\sigma_\infty$ is trivial on
$\mathbb{C}^\times$, it follows that each $t_i = 0$. On the other hand, the
assumption $\det \sigma(\tau) = -1$ implies that $\sigma(\tau)$ is not a scalar;
hence

$$\sigma(\tau) \sim \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},$$

which means $\pi_\infty = \mathrm{Ind}\, 1 \cdot \mathrm{sgn}$, as claimed.
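
The final step of the proof can be written out explicitly. The following is only a sketch in the notation above, where $\tau$ denotes complex conjugation; the symbols $\lambda_1, \lambda_2$ for the eigenvalues of $\sigma(\tau)$ are introduced here for illustration and do not appear in the text.

```latex
% tau has order two, so sigma(tau)^2 = I and each eigenvalue is a square root of 1:
\lambda_i^2 = 1 \quad\Longrightarrow\quad \lambda_1, \lambda_2 \in \{+1, -1\}.

% "Odd" means det sigma(tau) = -1, which excludes lambda_1 = lambda_2:
\lambda_1 \lambda_2 = \det \sigma(\tau) = -1
\quad\Longrightarrow\quad \{\lambda_1, \lambda_2\} = \{+1, -1\}.

% An element of finite order is diagonalizable, hence
\sigma(\tau) \sim \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},
\qquad
\sigma_\infty = (0,0) \oplus (0,1) = 1 \oplus \mathrm{sgn},
\qquad
\pi_\infty = \pi(1, \mathrm{sgn}).
```

The middle identification uses the labelling $(t, \varepsilon)$ of Example 3.1.1 with both $t_i = 0$, and the last one is the case $\sigma = (0,0) \oplus (0,1)$ of the $GL_2(\mathbb{R})$ example in §3.1.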


Concluding Remarks. (1) If $\det \sigma$ is even, then by the same reasoning
as above, $\sigma_\infty$ is the sum of two characters, but now either both
trivial or both the sgn character. Thus one concludes
$\pi_\infty = \pi(1,1)$ or $\pi(\mathrm{sgn}, \mathrm{sgn})$, and from Remark
2.5.5 it follows that $\pi(\sigma)$ corresponds to a cuspidal Maass eigenform of
eigenvalue $1/4$ for $\Delta$.

(2) In [DS], Deligne and Serre associated to each normalized new form
$f(z) = \sum a_n e^{2\pi i n z}$ in $S_1(\Gamma_0(N), \psi)$ an irreducible
two-dimensional representation $\sigma$ of $G_\mathbb{Q}$, of conductor $N$ and
(odd) determinant $\psi$, such that

$$\operatorname{trace}\, \sigma(\mathrm{Fr}_p) = a_p$$

for almost all primes $p$. Taken together with the Langlands reciprocity
conjecture for $F = \mathbb{Q}$ and “odd” $\sigma$ (or equivalently, Artin’s
conjecture for such $\sigma$), their result says that new forms of weight one
are one and the same thing as irreducible, odd two-dimensional
representations of $G_\mathbb{Q}$ (satisfying Artin’s conjecture).

(3) One expects an analogue of Deligne-Serre to hold for cuspidal Maass
eigenforms (of eigenvalue $1/4$), i.e., that to each such form there should
correspond an irreducible two-dimensional, even representation of
$G_\mathbb{Q}$ (satisfying Artin’s conjecture), with $L(\sigma, s) = L(f, s)$.
But this remains an open problem; cf. 4.3 below.

4.3. Representations of $W_F$ vs. “Arithmetic” Automorphic Representations
of GL(2)

For further reference, it will be convenient to repeat in a more precise form
the classification of two-dimensional “semisimple” representations

$$\sigma : W_F \longrightarrow GL_2(\mathbb{C})$$

over a number field $F$.

Proposition. Each $\sigma$ as above is classified according to its image in
$PGL_2(\mathbb{C})$, called the “type” of the representation:

(i) Cyclic type: $\sigma = \mu \oplus \nu$ is the direct sum of the two
one-dimensional representations defined by Hecke characters $\mu$ and $\nu$.

(ii) Dihedral type: $\sigma$ is irreducible of the form
$\mathrm{Ind}_{W_E}^{W_F}\, \theta$, with $\theta$ a character of
$E^\times \backslash \mathbb{A}_E^\times$, $E$ a quadratic extension of $F$, and
$\theta \ne \theta^\tau$ for $\tau \ne 1$ in $\mathrm{Gal}(E/F)$. (Such
representations are also called monomial.)

(iii) Exceptional type: The image of $\sigma$ in $PGL_2(\mathbb{C})$ is $A_4$,
$S_4$ or $A_5$.

Now let’s assume that the LRC holds for all irreducible

$$\sigma : W_F \longrightarrow GL_2(\mathbb{C}),$$

and ask which automorphic cuspidal representations of $GL_2(\mathbb{A}_F)$ are
of the form $\pi(\sigma)$ for some $\sigma$. A necessary condition is clearly
the following.

Definition. Given an irreducible admissible representation
$\pi = \otimes \pi_v$, and a real place $v$, let
$\sigma_v : W_\mathbb{R} \to GL_2(\mathbb{C})$ denote the Langlands parameter of
$\pi_v$. Then $\pi$ is called of type $A_{00}$ (resp. $A_0$) if the restriction
of $\sigma_v$ to $\mathbb{C}^\times$ is trivial (resp. the sum of characters of
the form $z \mapsto z^a \bar{z}^b$ with $a, b \in \mathbb{Z}$). Alternatively,
$\pi$ of type $A_{00}$ (resp. $A_0$) is called of Galois type (resp.
arithmetic).

Example 4.3.1. If $\sigma : W_\mathbb{Q} \to GL_2(\mathbb{C})$ actually factors
through $\mathrm{Gal}(\bar{\mathbb{Q}}/\mathbb{Q})$, then the corresponding
cuspidal $\pi(\sigma)$ (if it exists) will be of type $A_{00}$ (cf. Proposition
4.2 and the Remarks immediately following it). Conjecturally, one expects that
all cuspidal $\pi$ of $GL_2(\mathbb{A}_\mathbb{Q})$ of type $A_{00}$ are
“motivic,” i.e., arise in this way; in case $\det(\sigma)$ is odd, this is the
result of Deligne-Serre. On the other hand, cuspidal $\pi(\sigma)$ of type
$A_0$ are related to $\ell$-adic representations of $G_\mathbb{Q}$ or
$W_\mathbb{Q}$ (or the L-series attached to $\ell$-adic cohomology spaces of
varieties over $\mathbb{Q}$). This is the subject matter of [Ant] (really a
representation-theoretic reformulation and strengthening of
“Eichler-Shimura” theory). For example, if $\sigma_\infty$ is induced from the
character $z \mapsto z^{-n} \bar{z}^{-m}$ of $\mathbb{C}^\times$, with
$n > m \ge 0$, let $D_k$ denote the discrete series representation of
$GL_2(\mathbb{R})$ of lowest weight $k = n - m + 1$ (and appropriate central
character). Then Langlands in [La5] associates to $\pi = \otimes \pi_p$ of type
$A_0$ (with $\pi_\infty = \pi(\sigma_\infty) = D_k$) a two-dimensional
$\ell$-adic representation $\sigma$ of $G_\mathbb{Q}$ whose local $L$ and
$\varepsilon$ factors are (eventually) shown to agree with those of $\pi_p$ for
all $p$; cf. [Car].


§5. The Langlands Functoriality Principle: Theory and Results

All the automorphic results used to prove the Reciprocity Conjecture in
the two-dimensional solvable case, as well as the LRC itself, are but special
realizations of what Langlands calls “functoriality of automorphic forms
with respect to the L-group.” Hence it seems worthwhile to review some of
the necessary background on “functoriality” in this Section.

5.1. L-groups and L-factors

Recall that for $GL_2$, an unramified representation
$\pi_p = \pi_p(\mu_1, \mu_2)$ is parametrized by a semisimple conjugacy class in
$GL_2(\mathbb{C})$, namely the Langlands class

$$t_{\pi_p} = \begin{pmatrix} \mu_1(p) & 0 \\ 0 & \mu_2(p) \end{pmatrix}.$$

More generally, an arbitrary irreducible admissible representation $\pi_p$ is
parametrized by a Langlands parameter

$$\sigma_p : W_{\mathbb{Q}_p} \longrightarrow GL_2(\mathbb{C}),$$

and the “local Langlands Conjecture” says that the same should hold for
$GL_n$ over any local field $F_v$. Namely, each nice representation $\pi$ of
$GL_n(F_v)$ should be attached to a parameter
$\sigma_v : W_{F_v} \to GL_n(\mathbb{C})$.

For an arbitrary reductive group $G$ over a local field $F_v$, Langlands
introduced the notion of the L-group ${}^L G$ to take the place of
$GL_n(\mathbb{C})$ in parametrizing the irreducible admissible representations
of $G(F_v)$. Roughly speaking, each nice representation of $G(F_v)$ should be
attached to a “semisimple” homomorphism

$$\varphi : W_{F_v} \longrightarrow {}^L G,$$

and in the case of unramified representations, this should amount to fixing
a certain semisimple conjugacy class in ${}^L G$ (again called the Langlands
class $t_{\pi_v}$ attached to $\pi_v$).

In general, if $G$ is defined over a local or global field $F$, then
${}^L G$ is a group of the form

$${}^L G^0 \rtimes \mathrm{Gal}(\bar F/F).$$

Here ${}^L G^0$ is the complex Lie group “dual” to the root datum of
$G(\mathbb{C})$, and the action of $\mathrm{Gal}(\bar F/F)$ on ${}^L G^0$ is
trivial if and only if $G$ is split over $F$; cf. §2 of [Bo] for details. It is
sometimes convenient to replace $\mathrm{Gal}(\bar F/F)$ by
$\mathrm{Gal}(E/F)$, where $E$ is any Galois extension of $F$ over which $G$
splits. Indeed, since $\mathrm{Gal}(\bar F/E)$ acts trivially on ${}^L G^0$, we
can take ${}^L G$ to be

$${}^L G^0 \rtimes \mathrm{Gal}(E/F)$$

(now a complex reductive Lie group). For example, for $G = GL(n)$, and $F$
local or global, we can take

$${}^L G = GL_n(\mathbb{C}) \quad \text{or} \quad GL_n(\mathbb{C}) \times \mathrm{Gal}(E/F)$$
for any $E$. It is also convenient to define a semi-direct product

$${}^L G = {}^L G^0 \rtimes \Sigma$$

for $\Sigma$ any group endowed with a homomorphism into
$\mathrm{Gal}(\bar F/F)$, for example, the Weil group $W_F$. Henceforth, we
deal almost exclusively with this “Weil form” of ${}^L G$.

For the moment, let us also assume that $G$ is unramified over $F$, i.e.,
quasi-split over the local field $F$, and split over an unramified extension
$E$. For such a group (like $GL_n(F)$), an irreducible admissible
representation is called unramified if its restriction to a very special
maximal compact subgroup (like $GL_n(\mathcal{O}_F)$) contains the identity
representation (and then just once). If $\mathrm{Fr}$ denotes a Frobenius
generator for $\mathrm{Gal}(E/F)$, then the unramified representations $\pi$ of
$G(F)$ are in one-to-one correspondence with the semisimple
${}^L G^0$-conjugacy classes $t_\pi$ in
${}^L G^0 \rtimes \mathrm{Fr} \subset {}^L G$. The resulting bijection

$$\pi \longmapsto t_\pi$$

attaches to each unramified $\pi$ its Langlands class
$t_\pi = g \rtimes \mathrm{Fr}$ in ${}^L G$, and it is in terms of these
classes, and their matrix realizations, that the general Langlands L-factors
are defined.

Definition. By a representation of ${}^L G$ is meant a continuous
homomorphism $r : {}^L G \to GL_m(\mathbb{C})$ whose restriction to
${}^L G^0$ is a morphism of complex Lie groups. Given such an $r$, and an
unramified representation $\pi$, one sets

(5.1.1) $$L(s, \pi, r) = \det\left( I - r(t_\pi)\, q^{-s} \right)^{-1},$$

where $q$ is the order of the residue class field of $F$.


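Example. If $F = \mathbb{Q}_p$, $\pi = \pi_p$ is the local component of a
cuspidal representation of $GL(2, \mathbb{A}_\mathbb{Q})$ associated to a new
form in $S_k(SL_2(\mathbb{Z}))$, and
$r : GL_2(\mathbb{C}) \to GL_2(\mathbb{C})$ is the “standard” representation
taking $g$ to $g$, then

$$\begin{aligned}
L(s, \pi, r) &= \left[ (1 - \mu_1(p) p^{-s}) (1 - \mu_2(p) p^{-s}) \right]^{-1} \\
&= \left( 1 - a_p\, p^{-s - (k-1)/2} + p^{-2s} \right)^{-1}
\end{aligned}$$

with $a_p = p^{(k-1)/2} (\mu_1(p) + \mu_2(p))$.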
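
The second equality in this example is a direct expansion of the product. The sketch below assumes that the central character is trivial at $p$, so that $\mu_1(p)\mu_2(p) = 1$, as one expects for a form of level one; that assumption is not stated explicitly in the text.

```latex
\begin{aligned}
(1 - \mu_1(p) p^{-s})(1 - \mu_2(p) p^{-s})
  &= 1 - \bigl(\mu_1(p) + \mu_2(p)\bigr) p^{-s} + \mu_1(p)\mu_2(p)\, p^{-2s} \\
  &= 1 - a_p\, p^{-(k-1)/2}\, p^{-s} + p^{-2s}
     && \text{since } a_p = p^{(k-1)/2}(\mu_1(p) + \mu_2(p)) \\
  &= 1 - a_p\, p^{-s-(k-1)/2} + p^{-2s}.
\end{aligned}

% Shifting s to s - (k-1)/2 recovers the classical Hecke factor:
1 - a_p\, p^{-s} + p^{k-1-2s}.
```

In other words, under the stated assumption the Langlands factor is the classical Euler factor of the new form with $s$ shifted by $(k-1)/2$.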

The question remains: what L-factors can be assigned to irreducible
admissible $\pi$ which are not unramified? In the case of $GL_n$ (and some
other classical groups as well now), the local representation theory of $G$
may be used to directly define $L$ (and $\varepsilon$) factors $L(s, \pi, r)$
(and $\varepsilon(s, \pi, r)$), at least for $r$ sufficiently close to the
standard embedding of ${}^L G$ in some $GL_d(\mathbb{C})$. For example, for
$GL_n$, and $r : GL_n(\mathbb{C}) \to GL_n(\mathbb{C})$ the identity, [GoJa]
constructs such $L$ and $\varepsilon$ factors, and in [Ja] they are related
(modulo the local Langlands conjecture for GL(n)) to the $L$ and $\varepsilon$
factors of their corresponding Langlands parameter.

In general, a typical Langlands parameter $\varphi$ for $G$ is a continuous
homomorphism

$$\varphi : W_F \longrightarrow {}^L G \; (= {}^L G^0 \rtimes W_F)$$

such that $\varphi(w)$ is “semisimple” for each $w$ in $W_F$, and such that the
composition with projection onto $W_F$ induces the identity map. Then to each
representation $r : {}^L G \to GL_d(\mathbb{C})$, and class of parameters
$\varphi$ (modulo conjugation by ${}^L G^0$), one can attach the Langlands
factor

(5.1.2) $$L(s, \varphi, r) = \det\left( I - q^{-s}\, r(\varphi(\mathrm{Fr}))\big|_{V^I} \right)^{-1}.$$

Here $\mathrm{Fr}$ is a Frobenius element in $W_F$, and $V^I$ is the subspace
of $\mathbb{C}^d$ invariant under the action of the inertia subgroup. Note that
when $\varphi$ is unramified, i.e., trivial on $I$, then $\varphi$ is
determined by the semisimple element
$\varphi(\mathrm{Fr}) = g \rtimes \mathrm{Fr}$ in ${}^L G$, and
$L(s, \varphi, r)$ reduces to the Langlands L-function $L(s, \pi, r)$, with
$\pi$ the unramified representation of $G(F)$ such that $t_\pi$ is conjugate to
$\varphi(\mathrm{Fr})$. In general, a form of the “local Langlands conjecture”
for $G$ asserts that any irreducible admissible $\pi$ is associated to some
parameter $\varphi : W_F \to {}^L G$, and then $L(s, \pi, r)$ should be
$L(s, \varphi, r)$, for any $r$.

N.B. If $G$ is not quasi-split, its Langlands parameters must satisfy
certain additional “rationality” conditions; cf. 8.2(ii) of [Bo].

5.2. Statement of the Functoriality Principle

A homomorphism between L-groups

$$\rho : {}^L G \longrightarrow {}^L G'$$

is called an L-morphism if it commutes with the natural projections onto
$W_F$. Such a morphism clearly gives rise to a map

$$\varphi \longmapsto \varphi' = \rho \circ \varphi$$

between the Langlands parameters of $G$ and $G'$, hence (conjecturally) also
between representations of $G$ and $G'$ (by the local Langlands conjecture).

Now suppose $F$ is global, and $\pi = \otimes \pi_v$ is an automorphic cuspidal
representation of $G(\mathbb{A}_F)$. For almost every place $v$ of $F$, $G_v$
and $\pi_v$ will be unramified, and the corresponding Langlands class
$t_{\pi_v}$ in ${}^L G_v$ defined. Thus the (partial Langlands) L-function
$L^S(s, \pi, r)$ can be defined for any representation
$r : {}^L G \to GL_d(\mathbb{C})$ through the formula

$$L^S(s, \pi, r) = \prod_{v \text{ unramified}} L(s, \pi_v, r_v).$$

(Here each $r_v$ arises through composition of $r$ with the natural embedding

$${}^L G_v = {}^L G^0 \rtimes W_{F_v} \hookrightarrow {}^L G^0 \rtimes W_F.)$$

Langlands has shown that this Euler product converges in some half-plane,
and conjectured that it admits a meromorphic continuation to $\mathbb{C}$, with
only finitely many poles in $\mathrm{Re}(s) > 0$.

N.B. (1) If one accepts the local Langlands Conjecture, one can also
introduce $L(s, \pi_v, r_v)$ at the remaining places (as in (5.1.2)), and define
the “completed” functions $L(s, \pi, r)$ (and $\varepsilon(s, \pi, r)$).

(2) If $G = GL_n$, and $r : {}^L G \to GL_n(\mathbb{C})$ is the standard
representation, then $L^S(s, \pi, r)$ is simply denoted $L^S(s, \pi)$.

The Functoriality Principle. Suppose that $G'$ is quasi-split, and that
$\rho : {}^L G \to {}^L G'$ is a morphism of L-groups. For each $v$, consider
the corresponding commutative diagram

$$\begin{array}{ccc}
{}^L G_v & \xrightarrow{\ \rho_v\ } & {}^L G'_v \\
\downarrow & & \downarrow \\
{}^L G & \xrightarrow{\ \rho\ } & {}^L G'.
\end{array}$$

Then to each automorphic cuspidal representation $\pi = \otimes \pi_v$ of
$G(\mathbb{A}_F)$ there corresponds an automorphic representation
$\pi' = \otimes \pi'_v$ of $G'(\mathbb{A}_F)$ such that for almost all $v$
(where both $\pi_v$ and $\pi'_v$ are unramified)

$$t_{\pi'_v} = \rho_v(t_{\pi_v}).$$

In particular, for any representation $r' : {}^L G' \to GL_d(\mathbb{C})$,

$$L^S(s, \pi', r') = L^S(s, \pi, r' \circ \rho).$$

Moreover, if one accepts the local Langlands Conjecture for $G$, then the
Langlands parameter of $\pi'_v$ should be the image (under $\rho_v$) of the
Langlands parameter of $\pi_v$ for every $v$.


Example. Take $G = \{1\}$, and $G' = GL(n)$. In this case, a morphism

$$\rho : {}^L G \longrightarrow {}^L G' = GL(n, \mathbb{C}) \times W_F$$

must be of the form $\rho(1, w) = \sigma(w) \times w$, with a continuous
representation

$$\sigma : W_F \longrightarrow GL_n(\mathbb{C})$$

(and conversely, any Artin representation
$\sigma : W_F \to GL_n(\mathbb{C})$ determines a morphism $\rho_\sigma$ through
this formula). Since $G = \{1\}$, its only automorphic representation $\pi$ is
the trivial one, with Langlands class $1 \rtimes \mathrm{Fr}_v$ for every
(finite) $v$. Thus the Functoriality Conjecture in this case asserts that (for
any given $\sigma$) there is an automorphic representation
$\pi(\sigma) = \otimes \pi_v$ of $GL_n(\mathbb{A}_F)$ such that

$$t_{\pi_v} = \sigma(\mathrm{Fr}_v)$$

for almost every $v$.


This example shows that the general Reciprocity Conjecture is but a 
special instance of the Functoriality Principle. Hence it is clear that this 
Principle is more a guiding light than a problem to be solved in the near 
future! 
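
The claim that L-morphisms out of the L-group of the trivial group are the same thing as representations of $W_F$ can be checked in one line. The following is a sketch only; it treats ${}^L G' = GL(n, \mathbb{C}) \times W_F$ as a direct product, as in the example above, and writes $w, w'$ for arbitrary elements of $W_F$ (notation introduced here).

```latex
\begin{aligned}
\rho(1, w)\, \rho(1, w')
  &= (\sigma(w) \times w)\,(\sigma(w') \times w')
   = \sigma(w)\sigma(w') \times w w', \\
\rho(1, w w')
  &= \sigma(w w') \times w w'.
\end{aligned}

% Hence rho is a homomorphism exactly when sigma is:
\rho(1, w)\rho(1, w') = \rho(1, w w')
\iff
\sigma(w)\sigma(w') = \sigma(w w').

% Compatibility with the projections onto W_F is automatic, and on Langlands classes
\rho(1 \rtimes \mathrm{Fr}_v) = \sigma(\mathrm{Fr}_v) \times \mathrm{Fr}_v .
```

The last line is what turns the general condition $t_{\pi'_v} = \rho_v(t_{\pi_v})$ of the Functoriality Principle into the reciprocity condition $t_{\pi_v} = \sigma(\mathrm{Fr}_v)$.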

5.3. Established Examples of Functoriality 
We collect here some instances of “functoriality” which are required for the 
proof of Langlands-Tunnell. 

(A) Automorphic Induction 

This is a generalization of the classical construction of Hecke and
Maass, whereby an automorphic cuspidal representation of $GL_2(\mathbb{Q})$ (a
modular form, or Maass form, in their language) is attached to each Hecke
character of a quadratic extension of $\mathbb{Q}$ (which is purely imaginary
or real, respectively).

For a general formulation, fix a number field $F$, and $K$ a cyclic Galois
extension of $F$ of degree $n$. Let $G$ denote the group
$\mathrm{Res}_{K/F}\, GL_1$ (defined by “restriction of scalars” from $K$ to
$F$) and let $G'$ denote the group $GL_n$. Then $G$ is isomorphic to a maximal
$F$-torus of $G'$, and

$${}^L G = (GL_1(\mathbb{C}) \times \cdots \times GL_1(\mathbb{C})) \rtimes W_F,$$

with $W_F$ acting on ${}^L G^0$ through its projection onto
$\mathrm{Gal}(K/F)$, and (the generator of) $\mathrm{Gal}(K/F)$ acting through
cyclic permutation of the $n$ $GL_1(\mathbb{C})$-factors of ${}^L G^0$. Let
$\rho_1$ be the natural homomorphism of ${}^L G$ into the normalizer of a
maximal torus of ${}^L G'^{0} = GL(n, \mathbb{C})$, and define a morphism of
L-groups

$$\rho : {}^L G \longrightarrow {}^L G'$$

by $\rho(g) = (\rho_1(g), \rho_2(g))$, where $\rho_2 : {}^L G \to W_F$ is the
canonical projection. Note that an automorphic form on $G$ is the same thing
as a grossencharacter $\chi$ of $K$, since $G(F) = K^\times$; and when $v$
splits (completely) in $K$, the representation $\pi_v = \pi_v(\chi)$ (induced
from the $\chi_w$'s above $v$) satisfies $t_{\pi_v} = \rho(t_{\chi_v})$ if
$\chi$ is unramified at $v$. Thus the principle of functoriality suggests the
following:


Theorem 5.3.1. For each grossencharacter $\chi$ of $K$ there is an
automorphic representation $\pi(\chi)$ of $GL_n(\mathbb{A}_F)$ whose
L-function $L^S(s, \pi)$ equals the Hecke L-function $L^S(s, \chi)$; moreover,
$L^S(s, \pi(\chi))$ is entire (and hence $\pi(\chi)$ is cuspidal automorphic)
if $\chi$ does not factor through the norm map $N_{K/F}$ (equivalently $\chi$
is not fixed by the natural action of the Galois group $\mathrm{Gal}(K/F)$).
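
To make the L-group morphism underlying Theorem 5.3.1 concrete, here is a sketch for $n = 2$. The explicit formula for $\rho_1$ is one natural choice written down for illustration (the text does not spell it out); the names $a, b, c, d$, the permutation matrix $P$, and the nontrivial element $\tau$ of $\mathrm{Gal}(K/F)$ are notation introduced here.

```latex
% Elements of {}^L G are written (a,b) \rtimes w with a, b in C^x; the image of
% w in Gal(K/F) = {1, tau} either fixes (a,b) or swaps it to (b,a).
P = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},
\qquad
\rho_1\bigl((a,b) \rtimes w\bigr) =
\begin{cases}
\begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix} & \text{if } w \mapsto 1, \\[2ex]
\begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix} P & \text{if } w \mapsto \tau.
\end{cases}

% Multiplicativity in the case where w and w' both map to tau:
\bigl((a,b) \rtimes w\bigr)\bigl((c,d) \rtimes w'\bigr) = (ad,\, bc) \rtimes w w',
\qquad
\begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix} P
\begin{pmatrix} c & 0 \\ 0 & d \end{pmatrix} P
= \begin{pmatrix} ad & 0 \\ 0 & bc \end{pmatrix}.

% At a place where Frobenius maps to tau, the class (a,b) \rtimes Fr gives
\det\!\left( I - q^{-s} \begin{pmatrix} 0 & a \\ b & 0 \end{pmatrix} \right)
= 1 - ab\, q^{-2s}.
```

If one accepts that $ab$ is the value of $\chi$ at a uniformizer of the place of $K$ above $v$, the last determinant is the inverse of the local Hecke factor of $\chi$ there (the residue field of $K$ has $q^2$ elements), which is consistent with the equality of L-functions asserted in the theorem.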


For $n = 2$ and $F = \mathbb{Q}$, this Theorem follows essentially from the
classical work of Hecke and Maass. For $n = 2$ and $F$ arbitrary, it is proved
in [JL] (using L-functions), [LL] (using the “stable trace formula”), and [ST]
(using theta-series); it also follows from Jacquet’s “relative trace formula”
(cf. §VIII 4 of [Ge2]). For $n = 3$ it is proved in [J-PS-S2] (using
L-functions) and for arbitrary $n$ in [AC] (using the trace formula). The only
cases needed in the sequel are $n = 2$ or $3$, and here it is simplest to
(follow [JL] and [J-PS-S2] and) appeal to the so-called “Converse Theorem to
Hecke Theory.”

For this, suppose that $\pi = \otimes \pi_v$ is an irreducible admissible
representation of $GL_n(\mathbb{A}_F)$ whose central character is invariant
under $F^\times$. If $\pi$ is actually automorphic, then it is known from
“Hecke theory” (cf. [GoJa]) that $\pi$ is “nice” relative to any idele class
character $\omega$ of $F$, i.e., $L(s, \pi \otimes \omega)$ and
$L(s, \tilde\pi \otimes \omega^{-1})$ are absolutely convergent in some
half-plane, admit analytic continuations to the whole $s$-plane which are
bounded in vertical strips, and have a functional equation relating $s$ to
$1 - s$; moreover, if $\pi$ is cuspidal, then these analytic continuations are
also entire. For $n = 2$ or $3$, the Converse Theorem (cf. [JL] and [J-PS-S2])
simply says that the converse to each of these statements is also true.

Remarks 5.3.1. (a) In real life situations, like the application to proving
$\pi(\chi)$ automorphic in Theorem 5.3.1, the situation is complicated by the
fact that the representation we are trying to prove automorphic may not
be easily defined at every place, but rather only at almost all places; thus,
in fact, a more complicated “almost everywhere” version of the converse
theorem is needed; cf. §§13-14 of [J-PS-S2].

(b) In the paper [Co-PS], Cogdell and Piatetski-Shapiro conjecture that
the Converse Theorem should also hold for any $n$, with the additional
caveat that for $n \ge 4$, $\pi$ need only be almost everywhere equivalent to
some automorphic representation of $GL_n(\mathbb{A}_F)$; cf. [He].

(c) If $\chi$ (in Theorem 5.3.1) is not fixed by any non-trivial element of
$\mathrm{Gal}(K/F)$, then $\mathrm{Ind}_{W_K}^{W_F}\, \chi = \sigma$ is an
irreducible $n$-dimensional representation of $W_F$ with
$L(s, \sigma) = L(s, \chi)$ (a Hecke L-series with grossencharacter $\chi$ over
$K$). Hence Theorem 5.3.1 may be viewed as an affirmation of the Langlands
Reciprocity Conjecture for monomial representations.

(d) Note that “on the Galois side,” induction brings a Langlands parameter
for $GL_1$ (over $K$), namely $\chi : W_K \to \mathbb{C}^\times$, to a Langlands
parameter for $GL_n$ over $F$, namely
$\sigma = \mathrm{Ind}\, \chi : W_F \to GL_n(\mathbb{C})$. “On the automorphic
side,” this map is reflected by the correspondence
$\chi \mapsto \pi(\chi) = \pi(\sigma)$ (hence the aptness of the terminology
“automorphic induction”).

(e) Finally, we note that in case $n = 2$ or $3$, the “converse theorem
approach” to Theorem 5.3.1 does not depend on $K$ being a normal (Galois)
extension of $F$. This will be crucial in the application to the Reciprocity
Conjecture in the Octahedral case; cf. §7.2.

(B) The Symmetric Square Lifting 

Let $A$ denote the three-dimensional representation of $PGL_2(\mathbb{C})$
determined by the adjoint action of $PGL_2(\mathbb{C})$ on the Lie algebra of
$SL(2, \mathbb{C})$, and denote the resulting (three-dimensional)
representation

$$\begin{array}{ccc}
GL_2(\mathbb{C}) & \xrightarrow{\ \mathrm{Ad}\ } & GL_3(\mathbb{C}) \\
\searrow & & \nearrow_{A} \\
 & PGL_2(\mathbb{C}) &
\end{array}$$

of $GL(2, \mathbb{C})$ by Ad. This representation Ad may be viewed as a natural
morphism between the L-groups of GL(2) and GL(3).


Theorem 5.3.2. (The “Symmetric Square” Lift from GL(2) to GL(3);
cf. [GeJa]).

(i) To each cuspidal automorphic representation $\pi$ of
$GL_2(\mathbb{A}_F)$, there exists an automorphic representation $\Pi$ of
$GL_3(\mathbb{A}_F)$ such that for almost all $v$,

$$\Pi_v = \Pi_v(\mathrm{Ad}(\sigma_v))$$

whenever $\pi_v = \pi_v(\sigma_v)$; equivalently,

$$t_{\Pi_v} = \mathrm{Ad}(t_{\pi_v}).$$

(ii) This lift of $\pi$ to GL(3) is cuspidal automorphic unless $\pi$ is
monomial, i.e., of the form $\pi(\sigma)$, with $\sigma$ induced from a Hecke
character of some quadratic extension $K$.
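
It may help to see what Ad does to a diagonal Langlands class. The computation below is a sketch: $\alpha, \beta$ stand for the two eigenvalues of $t_{\pi_v}$, and $e, h, f$ for the standard basis of the Lie algebra of $SL(2, \mathbb{C})$ (all notation introduced here for illustration).

```latex
t = \begin{pmatrix} \alpha & 0 \\ 0 & \beta \end{pmatrix},
\quad
e = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix},
\quad
h = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},
\quad
f = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}.

% Conjugation by t scales the basis vectors:
t e t^{-1} = \alpha\beta^{-1}\, e, \qquad
t h t^{-1} = h, \qquad
t f t^{-1} = \alpha^{-1}\beta\, f,

% so that
\mathrm{Ad}(t) \sim \mathrm{diag}\bigl(\alpha\beta^{-1},\ 1,\ \alpha^{-1}\beta\bigr),
\qquad
L(s, \pi_v, \mathrm{Ad}) =
\bigl[(1 - \alpha\beta^{-1} q_v^{-s})(1 - q_v^{-s})(1 - \alpha^{-1}\beta\, q_v^{-s})\bigr]^{-1}.

% Comparison with the symmetric square of the standard representation:
\mathrm{Sym}^2(t) \sim \mathrm{diag}(\alpha^2,\ \alpha\beta,\ \beta^2)
= \det(t) \cdot \mathrm{diag}\bigl(\alpha\beta^{-1},\ 1,\ \alpha^{-1}\beta\bigr).
```

The last line shows that Ad and the symmetric square differ only by a twist by the determinant, which is why the lift attached to Ad is referred to as the “symmetric square” lift.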


Method of Proof. The “converse theorem for GL(3)” says that
$L^S(s, \pi, \mathrm{Ad})$ will be the L-function of an automorphic
representation $\Pi$ of GL(3) (with $t_{\Pi_v} = \mathrm{Ad}(t_{\pi_v})$) as
soon as $L^S(s, \pi, \mathrm{Ad})$ is shown to have the expected analytic
properties; moreover, this $\Pi$ will be cuspidal if and only if all
$L^S(s, \Pi \otimes \omega)$'s are entire. To establish the required analytic
properties, it is shown (following [Sh]) that

$$L^S(s, \pi, \mathrm{Ad}) = A_S(s) \int_{SL_2(F) \backslash SL_2(\mathbb{A})} \varphi_\pi(g)\, \Theta(g)\, E(g, s)\, dg.$$

Here $\varphi_\pi$ belongs to the space of $\pi$, $\Theta(g)$ is a
theta-function on Weil’s metaplectic group, $E(g, s)$ is an Eisenstein series
of half-integral weight which is real analytic in $g$ and meromorphic in $s$,
and $A_S(s)$ is a meromorphic function which at the possible poles of
$E(g, s)$ can be chosen non-zero.

N.B. The idea of using the integral of an automorphic form to derive
analytic properties of its L-function of course goes back to Hecke, and even
Riemann. But the idea of mixing automorphic forms in the integral with
Eisenstein series was first systematically developed by Rankin and Selberg,
and is now a flourishing industry; cf. below.

(C) Rankin-Selberg Products (Especially GL(3) x GL(3)) 

Underlying this work is the following instance of Langlands functorial- 
ity. Viewing GL(k,C) as the L-group of GL(k), and GL(n, C) x GL(m, C) 
as that of GL(n) x GL(m), consider the natural L-group morphism 


p: GLn(C) x Glm(C) 8+ GLam(C) 


given by the tensor product map. 

So far, there seems no hope of establishing Langlands functoriality in
this case, i.e., of proving the existence of an automorphic $\Pi$ on $\mathrm{GL}_{nm}$ such
that $t_{\Pi_v} = t_{\pi_v} \otimes t_{\pi'_v}$ for two given cuspidal representations $\pi'$ on $\mathrm{GL}_m$ and $\pi$
on $\mathrm{GL}_n$. Indeed, this is an important open problem, whose solution would
play a crucial role in finding “the” group whose irreducible representations
are expected to parametrize all the automorphic cuspidal representations
of $\mathrm{GL}_n$ (not just those “arithmetic” ones coming from representations of
$W_F$); cf. [Ram] for further discussion along these lines. A big first step,
however, was taken by Jacquet and Shalika:


Theorem 5.3.3. (cf. [JaSh1,2] and [Mo-Wald]) Given cuspidal representations
$\pi$ on $\mathrm{GL}_n$ and $\pi'$ on $\mathrm{GL}_m$, let $L^S(s,\pi \times \pi')$ denote the partial
L-function

$$\prod_{v \notin S} \left[\det\left(I - (t_{\pi_v} \otimes t_{\pi'_v})\, q_v^{-s}\right)\right]^{-1}.$$

(i) $L^S(s,\pi \times \pi')$, originally defined only in some right half-plane, extends
to a meromorphic function in all of $\mathbb{C}$, with functional equation relating the
value at $s$ to the value at $1-s$.

(ii) $L^S(s,\pi \times \pi')$ may be “completed” to an Euler product

$$L(s,\pi \times \pi') = \prod_{\text{all } v} L(s,\pi_v \times \pi'_v)$$

which is holomorphic on $\mathrm{Re}(s) \ge 1$ if $m \ne n$, and otherwise has a pole at
$s$ with $\mathrm{Re}(s) = 1$ if and only if $|\det(\ )|^{s-1} \otimes \pi' \cong \tilde{\pi}$ (the contragredient of
$\pi$).
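
The local factors of the partial L-function can be made concrete as follows. This is only a sketch under simplifying assumptions: both Satake parameters are taken to be diagonal, so that the eigenvalues of $t_{\pi_v} \otimes t_{\pi'_v}$ are the pairwise products of the diagonal entries. The names `tensor_eigenvalues` and `local_factor`, and the sample numbers, are hypothetical.

```python
from itertools import product

def tensor_eigenvalues(alphas, betas):
    """Eigenvalues of t (x) t' for t = diag(alphas), t' = diag(betas)."""
    return [a * b for a, b in product(alphas, betas)]

def local_factor(alphas, betas, q, s):
    """det(I - (t (x) t') q^{-s})^{-1} for diagonal t and t'."""
    x = q ** (-s)
    value = 1.0
    for eig in tensor_eigenvalues(alphas, betas):
        value /= (1 - eig * x)
    return value

# Made-up unramified data for GL(3) x GL(3) at a place with q = 5.
alphas = [1.0, -1.0, 1j]
betas = [1.0, 1j, -1j]
print(len(tensor_eigenvalues(alphas, betas)))   # number of eigenvalues, n * m
print(local_factor(alphas, betas, q=5, s=2.0))
```

For $n = m = 3$ the factor has degree 9 in $q_v^{-s}$, matching the target $\mathrm{GL}_{nm}$ of the tensor product map.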

As already suggested, the proof of this result constitutes a non-trivial 
representation-theoretic generalization of the classical integral representa- 
tions of Rankin and Selberg; see [Ja] for the case of GL(2) x GL(2). In the 
sequel, we need only the case n = m = 3. 

Concluding Remarks. (1) There is one more example of functoriality 
needed for the proof of Langlands-Tunnell, namely the theory of base- 
change of Saito, Shintani and Langlands. However, since that theory is 
so intimately tied up with Artin’s conjecture, and its proof relies on the 
trace formula rather than L-functions, it seems convenient to postpone 
discussion of it until the last lecture. 

(2) There are of course large aspects of the Langlands Program which we 
have not seriously broached here because they have no immediate bearing 
on Wiles’ work. Perhaps the most obvious such topic is the (conjectured) 
relation between Hasse-Weil zeta-functions of algebraic varieties (“motive” 
L-functions) and automorphic L-functions of type $L(s,\pi,r)$. For example,
in [La6] the zeta-functions of certain Shimura varieties are related to
automorphic L-functions of degree $2^n$. This “program” represents the
beginnings of a higher dimensional analogue of the theory of Eichler-Shimura
and has greatly influenced much of the work during the last twenty years 
in representation theory and the theory of automorphic forms. Among 
other things, it pushed to the forefront the need to refine and generalize 
the “Selberg trace formula”; more about this in the next lecture. It also 
brought into representation theory such crucial but different concepts as 
“L-indistinguishability,” “endoscopy,” “L-packets,” etc., and encouraged
the use of new algebro-geometric methods for counting points on these 
varieties. 

(3) Finally, one should say a few words about the relation between the 
Langlands Program and the Shimura-Taniyama-Weil Conjecture. Person- 
ally, I do not think that it is so significant that the Langlands Program 
actually includes the S-T-W conjecture as a special “example” (and that’s 
why I haven’t bothered to broach the topic here). After all, Taniyama
obviously made his Conjecture (and Shimura and Weil understood its
importance) before the Langlands Program was conceived. Also, from
the other point of view, it is equally clear that including the S-T-W Con- 
jecture inside the Langlands Program is more incidental than crucial to 
the Program. Rather the crux of the Program is two pronged: its overall 
vision relating motives of all kinds to automorphic representations, and its 
methods which push representation theory to the forefront, and infuse the
subject with a seemingly endless string of challenging problems. It is these 
aspects of the Langlands Program which (albeit indirectly) play a role in 
the proof of Fermat’s Last Theorem. 
Lecture III
Proof of the Langlands-Tunnell Theorem 


Abstract 
Our task is to describe the proof of the following: 


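Theorem. Suppose $F$ is a number field and the irreducible representation

$$\sigma : W_F \longrightarrow \mathrm{GL}_2(\mathbb{C})$$

has a solvable image in $\mathrm{PGL}_2(\mathbb{C})$. Then there exists a (unique) irreducible
automorphic cuspidal representation $\pi(\sigma) = \otimes \pi_v$ of $\mathrm{GL}_2(\mathbb{A}_F)$ such that

$$\mathrm{trace}(\sigma(\mathrm{Fr}_v)) = \mathrm{trace}(t_{\pi_v})$$

for almost every $v$.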


The crucial instance of the Functoriality Conjecture required in the
proof of this Theorem is the theory of “Base Change” as developed in
[La1]. This we describe in §6, along with its proof, which relies heavily on
trace formula methods. The application of base change to the Langlands
Reciprocity theorem is explained in §7, the proof of the actual theorem
proceeding in two steps: first the base change (trace formula) methods are
exploited to produce the best possible candidate for $\pi(\sigma)$ (which is called
$\pi_{ps}(\sigma)$); then the results from the theory of L-functions (recalled in §5) are
used to prove that $\pi_{ps}(\sigma)$ actually equals $\pi(\sigma)$.


§6. Base Change Theory 

(6.1). Fix $E$ a cyclic extension of the number field $F$, of prime degree $\ell$.
Roughly speaking, the theory of “base change” describes the correspondence
between automorphic representations of the groups $\mathrm{GL}_n(\mathbb{A}_F)$ and
$\mathrm{GL}_n(\mathbb{A}_E)$ which reflects the operation of restriction of Galois
representations of $W_F$ to $W_E$. The first results on base change for automorphic forms
(or representations) used the theory of L-functions, and were restricted to
the case of quadratic $E$ and $\mathrm{GL}_2$. The introduction of the trace formula
is due to H. Saito, who dealt with $\mathrm{GL}_2$ and arbitrary cyclic $E$ using the
classical language of automorphic forms; cf. [Sai]. Immediately after that,
Shintani reformulated Saito’s results using group representations, and gave
the correct local definition of base change lifting; cf. [Shin]. Finally,
Langlands saw the connection with Artin’s conjecture, and reshaped the trace
formula proof for $\mathrm{GL}_2$ in a form suitable for the later generalization to $\mathrm{GL}_n$
developed by Arthur and Clozel; see [La1] and [AC] for a more detailed
history. Since only the case $n = 2$ is required here, we restrict ourselves
henceforth to this case.

Definition. Suppose $\pi = \otimes \pi_v$ is an automorphic cuspidal representation
of $\mathrm{GL}_2(\mathbb{A}_F)$, and $\Pi = \otimes_w \Pi_w$ is an automorphic representation of $\mathrm{GL}_2(\mathbb{A}_E)$.
Then $\Pi$ is a base change lift of $\pi$, denoted $\mathrm{BC}_{E/F}(\pi)$, if for each place $v$ of
$F$, and $w \mid v$, the Langlands parameter attached to $\Pi_w$ equals the restriction
to $W_{E_w}$ of the Langlands parameter $\sigma_v : W_{F_v} \to \mathrm{GL}_2(\mathbb{C})$ of $\pi_v$.
Remarks. (i) The above (essentially local) definition of base-change lifting
is at the level of Langlands parameters rather than representations. The
key idea of [Shin] is to define the lift of $\pi_v$ on $\mathrm{GL}_2(F_v)$ directly in terms
of a character identity between $\pi_v$ and the extension of $\Pi_w$ to the group
$\mathrm{GL}_2(E_w) \rtimes \mathrm{Gal}(E_w/F_v)$. Implicit here is the fact that $\Pi_w$ is $\mathrm{Gal}(E_w/F_v)$
invariant and hence this extension, call it $\Pi'_w$, exists. If $\tau$ is a generator of
$\mathrm{Gal}(E_w/F_v)$, this character identity reads

$$\chi_{\Pi'_w}(g \times \tau) = \chi_{\pi_v}(x)$$

whenever $N_{E_w/F_v}(g) = g^{\tau^{\ell-1}} \cdots g^{\tau} g$ is conjugate in $\mathrm{GL}_2(E_w)$ to a regular
semisimple element $x$ of $G(F_v)$.
(ii) Functoriality. Let $G = \mathrm{GL}_2$ and set $G' = \mathrm{Res}_{E/F}(G)$. As recalled in
§5.3, ${}^L G'^0$ is then a product of $\ell$ copies of $\mathrm{GL}_2(\mathbb{C})$ indexed and permuted by
$\mathrm{Gal}(E/F)$. So let

$$\rho : {}^L G = \mathrm{GL}_2(\mathbb{C}) \times W_F \longrightarrow {}^L G' = {}^L G'^0 \rtimes W_F$$

be the natural morphism which takes $g \times w$ in ${}^L G$ to $(g,\ldots,g) \rtimes w$ in ${}^L G'$.
The Functoriality Principle suggests the existence of a map taking automorphic
cuspidal representations $\pi$ of $\mathrm{GL}_2(\mathbb{A}_F)$ to automorphic representations
$\Pi$ of $\mathrm{GL}_2(\mathbb{A}_E) \cong G'(\mathbb{A}_F)$ such that for $\pi_v$ and $\Pi_w$ unramified,

$$t_{\Pi_w} = \rho_v(t_{\pi_v}). \tag{$*$}$$

Using either definition of lifting given above, it is easy to check that if
$\pi_v = \pi(\mu_1, \mu_2)$ (with $\mu_i$ an unramified character of $F_v^\times$) then

$$\mathrm{BC}_{E/F}(\pi_v) = \Pi(\nu_1, \nu_2) \quad \text{with } \nu_i = \mu_i \circ N_{E_w/F_v}.$$

From this it follows that ($*$) holds, i.e., Base Change is functorial.
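N.B. In verifying that ($*$) holds, one must keep in mind that $\pi(\nu_1, \nu_2)$
(viewed as a representation of $\mathrm{GL}_2(E_w)$) corresponds first to the Langlands
class $g \times \sigma$ in $\mathrm{GL}_2(\mathbb{C}) \times \sigma$, with

$$g = \begin{pmatrix} \nu_1(\varpi_w) & 0 \\ 0 & \nu_2(\varpi_w) \end{pmatrix} = \begin{pmatrix} \mu_1(\varpi_v)^{\ell} & 0 \\ 0 & \mu_2(\varpi_v)^{\ell} \end{pmatrix}$$

(where $\varpi_v$ and $\varpi_w$ denote uniformizers), but the corresponding class in

$${}^L G' = \mathrm{GL}_2(\mathbb{C}) \times \cdots \times \mathrm{GL}_2(\mathbb{C}) \rtimes \sigma$$

(viewing $\pi(\nu_1, \nu_2)$ as a representation of $G'(F_v) \cong \mathrm{GL}_2(E_w)$) is

$$\left( \begin{pmatrix} \mu_1(\varpi_v) & 0 \\ 0 & \mu_2(\varpi_v) \end{pmatrix}, \ldots, \begin{pmatrix} \mu_1(\varpi_v) & 0 \\ 0 & \mu_2(\varpi_v) \end{pmatrix} \right) \rtimes \sigma,$$

i.e., just $\rho(t_{\pi_v})$. Indeed, the Hecke algebras of $\mathrm{GL}_2(E_w)$ and $G'(F_v)$ are
the same, and if $f_w$ and $f'_v$ represent the same element in this algebra $\mathcal{H}_w$,
then $(f'_v)^\vee(g_1, \ldots, g_\ell \rtimes \sigma) = f_w^\vee(g_\ell \cdots g_2 g_1)$; see 6.3 below for definitions of
$\mathcal{H}_w$ and the Satake isomorphism $(f')^\vee$.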
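
The bookkeeping in this N.B. can be checked on diagonal parameters. The sketch below assumes the inert unramified case, where $\nu_i(\varpi_w) = \mu_i(\varpi_v)^\ell$, and represents a diagonal matrix by the tuple of its diagonal entries; the helper names and the sample values are hypothetical.

```python
def power(diag, n):
    """n-th power of a diagonal matrix given by its diagonal entries."""
    return tuple(x ** n for x in diag)

def multiply(d1, d2):
    """Product of two diagonal matrices."""
    return tuple(x * y for x, y in zip(d1, d2))

def base_change_parameter(t, ell):
    """Parameter of the lift at an inert unramified place: t^ell."""
    return power(t, ell)

def collapse(components):
    """Multiply out (g_1, ..., g_ell) as g_ell ... g_2 g_1 (diagonal case)."""
    result = tuple(1 for _ in components[0])
    for g in components:
        result = multiply(g, result)
    return result

t = (2, 3)   # illustrative stand-ins for mu_1(varpi_v), mu_2(varpi_v)
ell = 3
print(collapse([t] * ell), base_change_parameter(t, ell))
```

Collapsing the class $(t,\ldots,t) \rtimes \sigma$ to the product of its components recovers $t^\ell$, which is the parameter attached to $\pi(\nu_1,\nu_2)$ on $\mathrm{GL}_2(E_w)$.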

(iii) Because we are assuming $E$ over $F$ cyclic of prime degree, each $v$ of $F$
either remains inert or splits completely. In the latter case, it is clear that
$E_w \cong F_v$ for any $w \mid v$, and the base change lift of $\pi_v$ is just $\Pi_w \cong \pi_v$. This
case being trivial, we usually assume (as above) that we are dealing with
the inert local case.
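
As a concrete instance of the inert/split dichotomy, here is a small Python sketch for the quadratic extension $\mathbb{Q}(i)/\mathbb{Q}$, an example chosen for illustration and not taken from the text. It relies on the standard fact that an odd prime $p$ splits in $\mathbb{Q}(i)$ exactly when $x^2 + 1$ has two roots modulo $p$ and is inert when it has none; the prime 2 is the single ramified exception. The function names are hypothetical.

```python
def roots_of_x2_plus_1(p):
    """Residues x modulo p with x^2 + 1 == 0 (mod p)."""
    return [x for x in range(p) if (x * x + 1) % p == 0]

def behaviour(p):
    """Splitting behaviour of the prime p in Q(i)."""
    n = len(roots_of_x2_plus_1(p))
    return {0: "inert", 1: "ramified", 2: "split"}[n]

for p in [2, 3, 5, 7, 11, 13]:
    print(p, behaviour(p))
```

No intermediate behaviour can occur for an unramified prime, because the decomposition group is a subgroup of a group of prime order.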


Theorem. (cf. [La1])

(a) Every cuspidal representation $\pi$ of $\mathrm{GL}_2(\mathbb{A}_F)$ has a unique base change
lift to $\mathrm{GL}_2(\mathbb{A}_E)$; the lift is itself cuspidal (as opposed to “just” automorphic)
unless $E$ is quadratic over $F$, and $\pi$ is monomial (or dihedral) of the
form $\pi(\sigma)$, with $\sigma = \mathrm{Ind}_{W_E}^{W_F} \theta$.

(b) If two cuspidal representations $\pi$ and $\pi'$ have the same base change lift
to $E$, then $\pi' \cong \pi \otimes \omega$ for some character $\omega$ of $F^\times N_{E/F}(\mathbb{A}_E^\times) \backslash \mathbb{A}_F^\times$.

(c) A cuspidal representation $\Pi$ of $\mathrm{GL}_2(\mathbb{A}_E)$ equals $\mathrm{BC}_{E/F}(\pi)$ for some
cuspidal $\pi$ on $\mathrm{GL}_2(\mathbb{A}_F)$ if and only if $\Pi$ is invariant under the natural
action of $\mathrm{Gal}(E/F)$.


In some ways, the proof of Base Change is as interesting as the result 
itself. Since it involves a form of the trace formula which should (and does) 
generalize, and apply to other instances of functoriality, we devote some 
time to it below. 

(6.2). The Trace Formula of Arthur-Selberg
Recall that the right regular representation $R_0$ of $G(\mathbb{A}_F)$ in the space of
cusp forms $L_0^2(G(F) \backslash G(\mathbb{A}_F), \omega)$ decomposes discretely as

$$R_0 = \bigoplus_{\pi} m_\pi\, \pi,$$

and it is the cuspidal constituents $\pi$ which are the building blocks of the
theory of automorphic forms on $G$. What the “trace” in “the trace formula”
refers to is the distributional trace of $R_0$. More precisely, suppose $f(g)$ is
any nice compactly supported “test function” on $G(\mathbb{A})$, and define the
operator $R_0(f)$ on $L_0^2(G(F) \backslash G(\mathbb{A}), \omega)$ through the formula

$$R_0(f) = \int_{Z(\mathbb{A}) \backslash G(\mathbb{A})} f(g)\, R_0(g)\, dg.$$

(For simplicity, assume that the central character $\omega$ of $R_0$ is trivial.) Then
clearly

$$\mathrm{trace}\, R_0(f) = \sum_{\pi} m_\pi\, \mathrm{trace}(\pi(f));$$

but as we know next to nothing about the $\pi$’s which occur in $R_0$, we
also know next to nothing about $\mathrm{trace}(R_0(f))$. The original idea of the
trace formula was to give an alternative formula for $\mathrm{trace}\, R_0(f)$, which
ultimately gives some of the sought after information about $R_0$ and its
constituents $\pi$.

The original trace formula was introduced by Selberg, in the context
of a semisimple Lie group $G$ and discrete subgroup $\Gamma$ (in place of our $G(\mathbb{A})$
and $G(F)$). In his famous 1956 paper [Sel], Selberg first of all described a
general formula for the case of compact $\Gamma \backslash G$ (equivalently $G(F) \backslash G(\mathbb{A})$);
it took the form

$$\mathrm{trace}\, R_0(f) = \sum_{\pi} m_\pi\, \mathrm{trace}\, \pi(f) = \sum_{\{\gamma\}} m_\gamma\, \Phi_f(\gamma) \tag{6.2.1}$$

with $\{\gamma\}$ running over the conjugacy classes in $G(F)$, and each $\Phi_f(\gamma)$ an
“orbital integral”

$$\int_{G_\gamma(\mathbb{A}) \backslash G(\mathbb{A})} f(g^{-1} \gamma g)\, dg.$$

Secondly, Selberg treated in detail certain non-compact quotient cases such
as $\mathrm{SL}_2(\mathbb{Z}) \backslash \mathrm{SL}_2(\mathbb{R})$, which already required the analytic continuation of
Eisenstein series to handle the continuous spectrum of $L^2$ outside $L_0^2$.

Subsequently, in the 1960’s and 70’s, Langlands developed a general
theory of Eisenstein series valid for any reductive group $G$, and Arthur
used it to develop a general trace formula in the context of not necessarily
compact quotients $G(F) \backslash G(\mathbb{A})$. The resulting trace formula of Arthur
takes the form

$$\sum_{\mathfrak{o}} J_{\mathfrak{o}}(f) = \sum_{\chi} J_{\chi}(f). \tag{6.2.2}$$

Here the left (or geometric) side of Arthur’s formula is a sum over special
types of equivalence classes in $G(F)$ (generalizing the ordinary notion of
equivalence in the case of compact quotient), and the sum on the right (or
spectral) side is over certain classes of automorphic cuspidal representations
of “Levi subgroups” of $G$ (as opposed to just the cuspidal representations
of $G(\mathbb{A})$ itself in the case of compact quotient). Although it looks like the
“trace” has been lost in Arthur’s trace formula, this is not really so; certain
of the spectral terms $J_\chi(f)$ add up to exactly $\mathrm{trace}\, R_0(f)$, and so making
(6.2.2) explicit still (ultimately) gives us the information we seek about
$\mathrm{trace}(R_0(f))$.
Now instead of focusing efforts on finding an explicit, new formula for
$\mathrm{trace}(R_0(f))$, it was the idea of Langlands to compare the trace formulas for
two different groups $G$ and $G'$, in order to find relations (presumably
“functorial”) between their automorphic representations. For example, this is
exactly the strategy exploited in §16 of [JL] to establish the correspondence
between automorphic cuspidal representations of $G = D^\times$ and $G' = \mathrm{GL}_2$,
$D$ a division quaternion algebra over $F$. In the case of $G = \mathrm{GL}_2$ and
$G' = \mathrm{Res}_{E/F} G$ this strategy brings us back to the proof of Theorem 6.1
which we now explain.

(6.3) The Proof of Base Change.
If $\tau$ denotes a generator of $\mathrm{Gal}(E/F)$, then $\tau$ acts naturally on $G'(\mathbb{A}_F) =
\mathrm{GL}_2(\mathbb{A}_E)$, and hence on $L^2(G'(F) \backslash G'(\mathbb{A}_F))$ through the rule

$$(\tau \cdot \varphi)(g) = \varphi(g^{\tau}).$$

We can also define a twisted regular representation $R^\tau$ through the
composition of $R$ with $\tau$. Then for any nice $f'$ on $G'(\mathbb{A}_F)$, there is a “twisted”
version of the trace formula of the form

$$\sum_{\mathfrak{o}'} J^{\tau}_{\mathfrak{o}'}(f') = \sum_{\chi'} J^{\tau}_{\chi'}(f') \tag{6.3.1}$$

with the “twisted” $\mathrm{trace}(R_0^\tau(f'))$ hidden inside the right side of (6.3.1),
and “twisted” orbital integrals $\Phi^\tau_{f'}(\gamma') = \int f'(g^{-\tau} \gamma' g)\, dg$ on the left. The
significance of working with a twisted formula for $G'$ is that then only Galois
invariant cuspidal representations $\Pi$ will contribute (to $\mathrm{trace}(R_0^\tau(f'))$)$^1$;
hence we might indeed establish the desired base change map $\pi \to \Pi$
between $G$ and $G'$ by relating (6.3.1) to (6.2.2), and ultimately $\mathrm{trace}(R_0(f))$
to $\mathrm{trace}(R_0^\tau(f'))$.
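
The reason only Galois invariant constituents contribute to the twisted trace is the elementary fact recorded in the footnote: a permutation matrix without fixed points has zero trace. The following Python sketch is a toy model only. It treats $\tau$ as a pure permutation of a finite list of constituents grouped into orbits, and ignores the intertwining operators that enter the genuine twisted trace; all names are hypothetical.

```python
def cyclic_shift_matrix(n):
    """Permutation matrix of the cycle i -> i + 1 (mod n)."""
    return [[1 if (i + 1) % n == j else 0 for j in range(n)] for i in range(n)]

def trace(matrix):
    """Sum of the diagonal entries."""
    return sum(matrix[i][i] for i in range(len(matrix)))

def twisted_trace(orbit_sizes):
    """Trace of tau on constituents grouped into orbits of the given sizes.

    An orbit of size 1 is a tau-invariant constituent; larger orbits are
    permuted without fixed points.
    """
    return sum(trace(cyclic_shift_matrix(n)) for n in orbit_sizes)

# Two invariant constituents and two orbits of size 3 (made-up data).
print(twisted_trace([1, 3, 1, 3]))
```

In this model the twisted trace simply counts the orbits of size 1, which is the counterpart of the statement that only Galois invariant $\Pi$ survive.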

The first step is to prove that the left-hand (i.e., geometric) sides of the
trace formulas for $G$ and $G'$ coincide, at least for certain “matching” $f$ and
$f'$. This matching is a non-trivial local step, which first of all requires that
the orbital integrals $\Phi_{f_v}(N\gamma')$ on $G_v$ match the twisted orbital integrals
$\Phi^\tau_{f'_v}(\gamma')$ on $G'_v$. Moreover, it must be shown that this matching $f' \to f$
is compatible with “base change at the unramified places,” in the following
sense:

If $\mathcal{H}_v$ denotes the Hecke algebra of bi-$K_v$-invariant (compactly
supported smooth) functions on $G_v$, each $f_v$ in $\mathcal{H}_v$ may be viewed as a function
on ${}^L G_v$ through the formula

$$f_v^\vee(t) = \mathrm{trace}\, \pi_v(f_v)$$

whenever $t_{\pi_v} = t$. (This is the Satake isomorphism $f_v \to f_v^\vee$, defined
analogously for the Hecke algebra $\mathcal{H}'_v$ of $G'_v$.) Then the base change map


1 This is because $\tau$ permutes the constituents $\Pi$ of $R_0$, and a
permutation matrix without fixed points has zero trace...




of Hecke algebras, dual to the base change morphism $\rho : {}^L G_v \to {}^L G'_v$, is
defined by

$$\rho^* : (f'_v)^\vee \longmapsto f_v^\vee(g) = (f'_v)^\vee(\rho(g));$$

and the compatibility condition mentioned above is that $f'_v$ will match its
image $\rho^*(f'_v) = f_v$ in the above sense, for any $f'_v$ in $\mathcal{H}'_v$. (This is what is
known as the fundamental lemma, in the context of “base change.”)

The next step, which takes a great deal more work on the spectral sides
of (6.2.2) and (6.3.1), is to conclude from the equality of the geometric trace
formulas that

$$\mathrm{trace}\, R_0(f) \ \text{essentially equals}\ \mathrm{trace}\, R_0^\tau(f')$$

for such matching $f$ and $f'$. Equivalently,

$$\sum_{\pi\ \text{cuspidal}} \mathrm{trace}\, \pi(f) = \sum_{\pi'\ \text{cuspidal}} \mathrm{trace}(\tau \circ \pi')(f') \tag{6.3.2}$$

with the sum on the right only over Galois fixed $\pi'$.
Now for $\pi_v, \pi'_v, f_v, f'_v$ all “unramified,” we will have $\pi'_v$ equal to the
base change lift of $\pi_v$ if and only if

$$\mathrm{trace}(\tau \circ \pi'_v)(f'_v) = \mathrm{trace}\, \pi_v(f_v)$$

for any $f_v$ the base change image of $f'_v$ as above. (Indeed, in terms of Satake
transforms, this last identity reads $(f'_v)^\vee(t_{\pi'_v}) = f_v^\vee(t_{\pi_v}) = (f'_v)^\vee(\rho(t_{\pi_v}))$,
i.e., $t_{\pi'_v} = \rho(t_{\pi_v})$, as required.)

In this way, with a “linear independence of characters” argument,
(6.3.2) ultimately implies that for a given $\pi$ occurring on the left-hand
side, there must be a $\pi'$ on the right-hand side which is Galois invariant,
and almost everywhere the base change lift of $\pi$ (and, conversely, all such
Galois invariant $\pi'$ thus arise). Thus the required correspondence is
established.

Remark. Whenever the trace formula can be used to establish an instance
of functoriality (like base change above), it offers the additional bonus of
characterizing the image of the automorphic representations in question.
This is not so for the method of L-functions (witness the example of the
lifting from GL(2) to GL(3), where the image is left uncharacterized, or
Proposition 7.2 below giving base change for non-Galois cubic $E$).


§7. Application to Artin’s Conjecture 

The idea of applying “base change” to attack Artin’s Conjecture arises
from the following observation. Suppose that for any $\sigma : W_F \to \mathrm{GL}_2(\mathbb{C})$
there really is a corresponding cuspidal representation $\pi(\sigma)$ of $\mathrm{GL}_2(\mathbb{A}_F)$.
Then it follows from the original definition of base change lifting that

$$\mathrm{BC}_{E/F}(\pi(\sigma)) = \pi(\mathrm{Res}\, \sigma|_{W_E})$$

for any cyclic extension $E$ of $F$. This means that if we start with $\sigma$,
and want to find candidates for $\pi(\sigma)$, then the thing to do is to pick an
$E$ such that $\pi(\mathrm{Res}\, \sigma|_{W_E})$ is already known to exist, and look among the
cuspidal $\pi$’s such that $\mathrm{BC}_{E/F}(\pi) = \pi(\mathrm{Res}\, \sigma|_{W_E})$. In this way the following
“obvious” strategy unfolds: Among the possible candidates for $\pi(\sigma)$, pick
a “best possible” one, call it $\pi_{ps}(\sigma)$ (for $\pi_{pseudo}(\sigma)$), and then prove that
$\pi_{ps}(\sigma)$ must equal $\pi(\sigma)$. Roughly speaking, the first step uses the trace
formula (via base change), while the second uses L-functions.

Convention. Henceforth, if we are given $\sigma : W_F \to \mathrm{GL}_2(\mathbb{C})$, and any
field $E$ over $F$, then by $\sigma_E$ we denote the restriction of $\sigma$ to $W_E$.


(7.1). The Tetrahedral Case
(a) Choosing $\pi_{ps}(\sigma)$
We are given an irreducible representation

$$\sigma : W_F \longrightarrow \mathrm{GL}_2(\mathbb{C})$$

whose image in $\mathrm{PGL}_2(\mathbb{C})$ is isomorphic to $A_4$. This group is solvable, with
composition series

$$A_4 \supset D_2 \supset \{e\}.$$

(In general, $D_n$ will denote the dihedral group of $2n$ elements; in this case,
$D_2$ is the Klein 4-group.) Since $A_4/D_2 = A_3 = Z_3$, the inverse image of
$D_2$ in $W_F$ under the map

$$W_F \longrightarrow A_4 \subset \mathrm{PGL}_2(\mathbb{C})$$

is a (normal) subgroup of index 3, hence the Weil group of a cubic extension
of $F$, call it $E$. Pictorially:

$$\begin{array}{ccccccccc}
1 & \to & W_E & \to & W_F & \to & \mathrm{Gal}(E/F) & \to & 1 \\
 & & \downarrow & & \downarrow & & \downarrow\cong & & \\
1 & \to & D_2 & \to & A_4 & \to & Z_3 & \to & 1
\end{array}$$

Thus the resulting representation $\sigma_E : W_E \to \mathrm{GL}_2(\mathbb{C})$ is “monomial” in
the sense of Proposition 4.3.

Let $\pi(\sigma_E)$ denote the automorphic cuspidal representation of $\mathrm{GL}_2(\mathbb{A}_E)$
attached to this monomial representation by Theorem 5.3.1. This
representation of $\mathrm{GL}_2(\mathbb{A}_E)$ is clearly invariant under the action of $\mathrm{Gal}(E/F)$;
indeed, $\pi(\sigma_E)^\tau = \pi(\sigma_E^\tau) = \pi(\sigma_E)$. So by (the Base Change) Theorem 6.1,
$\pi(\sigma_E)$ will be the base change lift of exactly three classes of irreducible
cuspidal representations $\pi_i$ of $\mathrm{GL}_2(\mathbb{A}_F)$, each one related to the other by a
twist $\omega \circ \det$ for some character $\omega$ of $F^\times N_{E/F}(\mathbb{A}_E^\times) \backslash \mathbb{A}_F^\times$, i.e.,

$$\pi_i = \pi_j \otimes \omega \circ \det.$$

These $\pi_i$’s are our natural candidates for $\pi(\sigma)$.

Recall that the central character of $\pi(\sigma)$ is to be $\det \sigma$. On the other
hand, the central character $\omega_i$ of each $\pi_i$ above “base change lifts” to the
central character of $\pi(\sigma_E)$, which is $\det \sigma_E = (\det \sigma) \circ N_{E/F}$. Since each
$\omega_i = \omega_j \omega^2$ if $\pi_i = \pi_j \otimes \omega \circ \det$, it is clear that exactly one of these $\pi_i$’s has
central character $\det \sigma$, and this is the one we choose to be $\pi_{ps}(\sigma)$.
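
The counting behind “exactly one” can be sketched as follows. The model is an assumption made for illustration: the characters of $F^\times N_{E/F}(\mathbb{A}_E^\times) \backslash \mathbb{A}_F^\times$ form a cyclic group of order $\ell = [E:F]$, written as powers $\omega^k$ of a generator; twisting a candidate by $\omega^k$ multiplies its central character by $\omega^{2k}$; and $\det \sigma$ differs from the central character of a fixed candidate by some $\omega^m$, since both have the same composite with the norm. The function name `candidates_matching` is hypothetical.

```python
def candidates_matching(m, ell=3):
    """Twists k in Z/ell with omega^(2k) == omega^m, for omega of order ell."""
    return [k for k in range(ell) if (2 * k - m) % ell == 0]

# Cubic case: squaring is a bijection on Z/3, so one candidate always matches.
for m in range(3):
    print(m, candidates_matching(m))

# Quadratic case: omega^2 is trivial, so central characters separate nothing.
print(candidates_matching(0, ell=2))
```

The second call illustrates why the same device is not available when $E/F$ is quadratic, which is the situation met in the octahedral case.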

(b) Proving $\pi_{ps}(\sigma) = \pi(\sigma)$

Write $\pi_{ps}(\sigma) = \otimes \pi_v$. Then for each $v$, $\pi_v = \pi_v(\sigma'_v)$ for some

$$\sigma'_v : W_{F_v} \longrightarrow \mathrm{GL}_2(\mathbb{C}),$$

and what we must prove is that

$$\sigma'_v = \sigma_v \tag{7.1.1}$$

for almost every $v$.
Note that the restriction of $\sigma'_v$ to $W_{E_w}$ (for $w \mid v$) is by construction
the same as the restriction of $\sigma_v$ to $W_{E_w}$. Thus there is nothing to prove
in case $v$ splits (completely) in $E$, and we henceforth assume $E_w$ cubic and
unramified over $F_v$.
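If $\mathrm{Fr}_v$ denotes a Frobenius element of $\mathrm{Gal}(E_w/F_v)$ we can suppose

$$\sigma_v(\mathrm{Fr}_v) = \begin{pmatrix} a_v & 0 \\ 0 & b_v \end{pmatrix} \quad \text{and} \quad \sigma'_v(\mathrm{Fr}_v) = \begin{pmatrix} c_v & 0 \\ 0 & d_v \end{pmatrix}$$

for some $a_v, b_v, c_v, d_v$ in $\mathbb{C}^\times$. Then to prove (7.1.1) it will suffice to prove
that $\begin{pmatrix} a_v & 0 \\ 0 & b_v \end{pmatrix}$ is conjugate to $\begin{pmatrix} c_v & 0 \\ 0 & d_v \end{pmatrix}$. But the fact that $\sigma_v$ and $\sigma'_v$
have the same restrictions to $W_{E_w}$ means that $\sigma_v(\mathrm{Fr}_v)^3$ is conjugate to
$\sigma'_v(\mathrm{Fr}_v)^3$ (since $\mathrm{Fr}_v^3$ belongs to $W_{E_w}$). Thus

$$\begin{pmatrix} a_v^3 & 0 \\ 0 & b_v^3 \end{pmatrix} \ \text{is conjugate to}\ \begin{pmatrix} c_v^3 & 0 \\ 0 & d_v^3 \end{pmatrix}.$$

In particular, for some pair of cube roots of 1, say $\xi$ and $\xi'$, either

$$c_v = \xi a_v \quad \text{and} \quad d_v = \xi' b_v,$$

or else

$$c_v = \xi b_v \quad \text{and} \quad d_v = \xi' a_v.$$

We claim now that $\xi' = \xi^2$. Indeed $\pi_{ps}(\sigma)$ was chosen so that

$$\omega_{\pi_{ps}(\sigma)} = \det(\sigma).$$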
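Since this implies $\det \sigma'_v = \det \sigma_v$, we must have $\xi\xi' = 1$, i.e., $\xi' = \xi^2$. So
to prove (7.1.1) it will suffice to prove

$$\xi = 1. \tag{7.1.2}$$

To continue, let us assume (for the moment) that

$$\mathrm{Ad} \circ \sigma'_v = \mathrm{Ad} \circ \sigma_v. \tag{7.1.3}$$

Since the kernel of $\mathrm{Ad} : \mathrm{GL}_2(\mathbb{C}) \to \mathrm{GL}_3(\mathbb{C})$ is precisely the group of scalar
matrices $\left\{ \begin{pmatrix} \lambda & 0 \\ 0 & \lambda \end{pmatrix} \right\}$, it follows from (7.1.3) that $\sigma_v(\mathrm{Fr}_v)$ and $\sigma'_v(\mathrm{Fr}_v)$ must
differ by some scalar $\lambda \ne 0$. Thus

$$\begin{pmatrix} \xi a_v & 0 \\ 0 & \xi^2 b_v \end{pmatrix} \ \text{is conjugate to}\ \begin{pmatrix} \lambda a_v & 0 \\ 0 & \lambda b_v \end{pmatrix}$$

and it suffices to prove

$$\lambda = 1.$$

If $\lambda a_v = \xi a_v$ and $\lambda b_v = \xi^2 b_v$ then $\lambda = \xi = \xi^2 = 1$ for the trivial reason
that $\xi$ is a cube root of 1. On the other hand, if $\lambda a_v = \xi^2 b_v$ and $\lambda b_v = \xi a_v$,
then $\lambda^2 = 1$ (since $a_v = (\xi^2/\lambda)\, b_v = (\lambda/\xi)\, b_v$). If $\lambda = -1$, this means that
the image of

$$\sigma_v(\mathrm{Fr}_v) = \begin{pmatrix} a_v & 0 \\ 0 & b_v \end{pmatrix} = \begin{pmatrix} a_v & 0 \\ 0 & a_v \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & \xi\lambda \end{pmatrix}$$

in $\mathrm{PGL}_2(\mathbb{C})$ is of order 6 (since $\xi\lambda$ will then have order 6). But as $A_4$ has
no elements of order 6, this means we are done.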
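
Two finite facts carry the end of this argument: $A_4$ has no element of order 6, and $\xi\lambda$ has order 6 when $\xi$ is a primitive cube root of 1 and $\lambda = -1$. Both can be confirmed by brute force. The Python sketch below does so, realizing $A_4$ and $S_4$ as permutations of four letters; the helper names are hypothetical, and the same enumeration gives the corresponding fact for $S_4$ used in the octahedral case.

```python
import cmath
from itertools import permutations

def order(p):
    """Order of a permutation p, given as a tuple with i -> p[i]."""
    identity = tuple(range(len(p)))
    q, n = p, 1
    while q != identity:
        q = tuple(p[i] for i in q)   # multiply by p once more
        n += 1
    return n

def is_even(p):
    """True if p has an even number of inversions."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return inversions % 2 == 0

def multiplicative_order(z, max_order=12, tol=1e-9):
    """Smallest n >= 1 with z**n == 1 up to tol, or None if none is found."""
    for n in range(1, max_order + 1):
        if abs(z ** n - 1) < tol:
            return n
    return None

S4 = list(permutations(range(4)))
A4 = [p for p in S4 if is_even(p)]
orders_A4 = sorted({order(p) for p in A4})
orders_S4 = sorted({order(p) for p in S4})
xi = cmath.exp(2j * cmath.pi / 3)    # a primitive cube root of 1

print(orders_A4, orders_S4, multiplicative_order(-xi))
```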

It remains to prove (7.1.3). For this, we note (following Serre) that
$\mathrm{Ad} \circ \sigma : W_F \to \mathrm{GL}_3(\mathbb{C})$ is a monomial representation. In particular, there
is a character $\theta$ of $W_E$ (not invariant by $\mathrm{Gal}(E/F)$) such that

$$\mathrm{Ad} \circ \sigma = \mathrm{Ind}_{W_E}^{W_F} \theta.$$

This means (again by Theorem 5.3.1, this time with $n = 3$) that there is
associated to this irreducible representation $\mathrm{Ad} \circ \sigma$ a cuspidal automorphic
representation of $\mathrm{GL}_3(\mathbb{A}_F)$, call it $\Pi_1$. On the other hand, by the
“symmetric square lift” (Theorem 5.3.2) $\pi_{ps}(\sigma)$ has a lift to $\mathrm{GL}_3(\mathbb{A}_F)$, call it $\Pi_1^*$,
which is almost everywhere associated to the Langlands parameter $\mathrm{Ad} \circ \sigma'_v$.
Thus to prove (7.1.3), it clearly suffices to prove that

$$\Pi_1^* \cong \Pi_1. \tag{7.1.4}$$

N.B. The automorphic representation $\Pi_1^*$ will be cuspidal automorphic
(by Theorem 5.3.2) if and only if $\pi_{ps}(\sigma)$ is not monomial. But if
$\pi_{ps}(\sigma)$ were equal to $\pi(\sigma')$ for any irreducible two dimensional (let alone
monomial) representation of $W_F$, we would have to conclude that $\sigma' = \sigma$
(which is impossible, since $\sigma$ is tetrahedral, not monomial). Therefore $\Pi_1^*$
is also cuspidal, and the proof of (7.1.4) reduces to the following:
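Lemma. The Rankin-Selberg L-function $L(s, \Pi_1^* \times \tilde{\Pi}_1)$ on GL(3) x GL(3)
has a pole at $s = 1$ (and so, by Theorem 5.3.3, $\Pi_1^*$ is indeed isomorphic
to $\Pi_1$).

Proof By definition

$$L(s, \Pi_1^* \times \tilde{\Pi}_1) = \prod_{v} L(s, (\Pi_1^*)_v \times (\tilde{\Pi}_1)_v),$$

where for almost every $v$ (namely the “unramified” $v$),

$$L(s, (\Pi_1^*)_v \times (\tilde{\Pi}_1)_v) = L(s, (\mathrm{Ad} \circ \sigma'_v) \otimes (\mathrm{Ad} \circ \tilde{\sigma}_v)).$$

Keeping in mind that $\mathrm{Ad} \circ \sigma$ is monomial, it is possible to check that we
also have

$$L(s, (\Pi_1^*)_v \times (\tilde{\Pi}_1)_v) = L(s, (\Pi_1)_v \times (\tilde{\Pi}_1)_v) \tag{7.1.5}$$

(again for almost every $v$). Indeed, since $\mathrm{Ad} \circ \sigma$ is induced from $\theta$ on $E$, we
have

$$\mathrm{Ad} \circ (\tilde{\sigma}_v) = \bigoplus_{w \mid v} \mathrm{Ind}_{W_{E_w}}^{W_{F_v}} \theta_w^{-1}.$$

Hence

$$\mathrm{Ad}(\sigma'_v) \otimes \mathrm{Ad}(\tilde{\sigma}_v) = \bigoplus_{w \mid v} \mathrm{Ind}_{W_{E_w}}^{W_{F_v}} (\theta_w^{-1} \otimes \Sigma'_w)$$

if $\Sigma_w$ (resp. $\Sigma'_w$) denotes the restriction of $\mathrm{Ad}(\sigma_v)$ (resp. $\mathrm{Ad}(\sigma'_v)$) to $W_{E_w}$.
(Here we are using the fact that for $\sigma$ (resp. $\Sigma$) a representation of some
group $G$ (resp. a subgroup $H$),

$$\sigma \otimes \mathrm{Ind}_H^G\, \Sigma \cong \mathrm{Ind}_H^G (\Sigma \otimes \mathrm{Res}\, \sigma|_H).)$$

Similarly we have

$$\mathrm{Ad}(\sigma_v) \otimes \mathrm{Ad}(\tilde{\sigma}_v) = \bigoplus_{w \mid v} \mathrm{Ind}_{W_{E_w}}^{W_{F_v}} (\theta_w^{-1} \otimes \Sigma_w).$$

So since $\Sigma_w = \Sigma'_w$ almost everywhere (by construction), we indeed have

$$\begin{aligned}
L(s, (\Pi_1^*)_v \times (\tilde{\Pi}_1)_v) &= L(s, \mathrm{Ad}(\sigma'_v) \otimes \mathrm{Ad}(\tilde{\sigma}_v)) \\
&= L(s, \mathrm{Ad}(\sigma_v) \otimes \mathrm{Ad}(\tilde{\sigma}_v)) \\
&= L(s, (\Pi_1)_v \times (\tilde{\Pi}_1)_v)
\end{aligned}$$

for almost every $v$.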
Using (7.1.5), it remains to show that $\Pi_1^* \cong \Pi_1$. So suppose (7.1.5)
holds for all $v$ outside the finite set $S$. Then

$$L(s, \Pi_1^* \times \tilde{\Pi}_1) = \left( \prod_{v \in S} \frac{L(s, (\Pi_1^*)_v \times (\tilde{\Pi}_1)_v)}{L(s, (\Pi_1)_v \times (\tilde{\Pi}_1)_v)} \right) \cdot L(s, \Pi_1 \times \tilde{\Pi}_1).$$

But by Theorem 5.3.3, $L(s, \Pi_1 \times \tilde{\Pi}_1)$ has a pole at $s = 1$; moreover, the
quotient expression in parentheses above is non-zero at $s = 1$. Therefore
$L(s, \Pi_1^* \times \tilde{\Pi}_1)$ also has a pole at $s = 1$, as asserted, and this in turn implies
(by the same Theorem 5.3.3) that $\Pi_1^* \cong \Pi_1$.

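(7.2). The Octahedral Case

(a) Choosing $\pi_{ps}(\sigma)$

In this case, the image of $\sigma(W_F)$ in $\mathrm{PGL}_2(\mathbb{C})$ is $S_4$, and the pull-back
of the normal subgroup $A_4 \subset S_4$ is the Weil group $W_E$ of a quadratic
extension $E$ of $F$.

Pictorially:

$$\begin{array}{ccccccccc}
1 & \to & W_E & \to & W_F & \to & \mathrm{Gal}(E/F) & \to & 1 \\
 & & \downarrow & & \downarrow & & \downarrow\cong & & \\
1 & \to & A_4 & \to & S_4 & \to & Z_2 & \to & 1
\end{array}$$

Since $\sigma_E = \mathrm{Res}\, \sigma|_{W_E}$ is now of tetrahedral type, we know $\pi(\sigma_E)$ exists as
an irreducible cuspidal representation of $\mathrm{GL}_2(\mathbb{A}_E)$ (by the results of the last
paragraph). Moreover, we again have $\pi(\sigma_E)$ invariant under the action of
$\mathrm{Gal}(E/F)$. So again by Theorem 6.1, we conclude that $\pi(\sigma_E)$ must equal
$\mathrm{BC}_{E/F}(\pi_i)$ for (this time) two irreducible cuspidal representations $\pi_i$ of
$\mathrm{GL}_2(\mathbb{A}_F)$. The problem now is that we can no longer distinguish these $\pi_i$’s
by their central characters. Indeed, $\pi_1$ now equals $\pi_2 \otimes \omega$ for a quadratic
character $\omega$ of $F^\times \backslash \mathbb{A}_F^\times$; hence $\omega_{\pi_1} = \omega_{\pi_2} \omega^2 = \omega_{\pi_2}$.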

Tunnell’s contribution to the “Langlands-Tunnell Theorem” was to 
get around this problem by appealing to a new kind of base-change which 
appeared only after the publication of [La1], namely the following result:


Proposition. (cf. [J-PS-S3]) If $L$ is a cubic not necessarily Galois extension
of $F$, then each automorphic cuspidal representation $\pi$ of $\mathrm{GL}_2(\mathbb{A}_F)$
has a base change lift $\Pi$ on $\mathrm{GL}_2(\mathbb{A}_L)$, i.e., $\Pi = \mathrm{BC}_{L/F}(\pi)$ is automorphic,
and for almost every place $v$ of $F$, and place $w$ of $L$ dividing $v$, $\pi_v = \pi_v(\sigma_v)$
implies $\Pi_w = \pi_w(\mathrm{Res}_{W_{L_w}}\, \sigma_v)$.


The proof of [J-PS-S3] uses the theory of L-functions for the groups
GL(3) and GL(2) x GL(3) (and is entirely analogous to Jacquet’s original
proof of base change for $\mathrm{GL}_2$ over a quadratic extension in [Ja]). The idea
is to introduce the representation $\Pi$ on $\mathrm{GL}_2(\mathbb{A}_L)$ through the formula

$$L(s, \Pi \times \chi) = L(s, \pi \times \pi(\chi));$$

here $\chi$ is any Hecke character of $L$, $\pi(\chi)$ is the corresponding automorphic
representation of $\mathrm{GL}_3(\mathbb{A}_F)$ (whose existence is assured by Theorem 5.3.1
in the non-Galois case; recall Remark 5.3.1 (e)), and $L(s, \pi \times \pi(\chi))$ is
the Rankin-Selberg L-function on GL(2) x GL(3). Then one shows that
$L(s, \Pi \times \chi)$ has the analytic properties required by the Converse Theorem
to ensure that $\Pi$ is automorphic. (The fact that each $\Pi_w$ is the base change
lift of $\pi_v$ is relatively easy to check, from the definitions.)

N.B. The trace formula methods of [La1] fail in this context precisely
because there may not be any Galois group attached to $L$ over $F$ (hence
no way to define the twisted trace $R_0^\tau \ldots$). On the other hand, because
L-function methods are used, there is no way to characterize the image
of this base change map; fortunately, as we shall now see, there is also no
need for this in the application Tunnell found for this result.

What Tunnell did in [Tu] is introduce $L/F$ as the cubic (non-normal)
subextension of $K/F$ fixed by a 2-Sylow subgroup (of order 8) of $S_4$. (More
precisely, $L$ is the cubic subextension fixed by all elements of $\mathrm{Gal}(K/F)$
mapping to this chosen Sylow subgroup.) Then if $M$ is the compositum in
$K$ of $L$ and $E$ (the quadratic Galois extension chosen above), we have the
diagram shown in Figure 1,

[Figure 1: the tower of fields $F \subset L, E \subset M \subset K$, with $\mathrm{Gal}(K/F) = S_4$,
$\mathrm{Gal}(K/E) = A_4$, $\mathrm{Gal}(K/M) = D_2$, $\mathrm{Gal}(M/L) = Z_2$, $\mathrm{Gal}(M/F) = S_3$ and $[E:F] = 2$.]


and the crucial: 


Lemma. (cf. [Tu], page 174) There is a unique $i = 1, 2$ such that

$$\mathrm{BC}_{L/F}(\pi_i) = \pi(\sigma_L)$$

(and this is the $\pi_i$ to be designated as $\pi_{ps}(\sigma)$).


Proof Note first that $\pi(\sigma_L)$ actually exists, since the 2-Sylow subgroup
used to define $L$ is just $D_4$, and therefore $\sigma_L$ is monomial; similarly,
$\mathrm{BC}_{L/F}(\pi_i)$ exists for $i = 1, 2$ by the Base Change Theorem quoted above.
To prove the Lemma, one appeals to the identity

$$\mathrm{BC}_{M/L}(\mathrm{BC}_{L/F}(\pi_i)) = \pi(\sigma_M) \quad \text{for } i = 1, 2.$$

(This is “transitivity of base change”; it follows immediately from the
definition of base change.) Since $\mathrm{BC}_{L/F}(\pi_2)$ and $\mathrm{BC}_{L/F}(\pi_1)$ have the same
(quadratic) base change to $M$, it follows that

$$\mathrm{BC}_{L/F}(\pi_2) \cong \mathrm{BC}_{L/F}(\pi_1) \otimes \omega_{M/L}.$$

Now we claim that the representations $\mathrm{BC}_{L/F}(\pi_i)$ are distinct for $i =
1, 2$. Indeed, if they were not, we would have

$$\mathrm{BC}_{L/F}(\pi_1) \cong \mathrm{BC}_{L/F}(\pi_1) \otimes \omega_{M/L},$$

which by Lemma 11.7 of [La1] implies $\pi_1$ is “monomial.” By part (a) of
Theorem 6.1, this would then imply $\mathrm{BC}_{M/L}(\mathrm{BC}_{L/F}(\pi_1)) = \pi(\sigma_M)$ is not
cuspidal. But the image of $\sigma_M$ in $\mathrm{PGL}_2(\mathbb{C})$ is $S_3 \cong D_3$, which means that $\sigma_M$
itself is monomial and irreducible, i.e., $\pi(\sigma_M)$ is cuspidal. This contradiction
establishes that $\mathrm{BC}_{L/F}(\pi_1)$ and $\mathrm{BC}_{L/F}(\pi_2)$ are the two (distinct) cuspidal
representations of $\mathrm{GL}_2(\mathbb{A}_L)$ yielding $\pi(\sigma_M)$ upon base change to $M$. Since
we also have $\mathrm{BC}_{M/L}(\pi(\sigma_L)) = \pi(\sigma_M)$, it must be that $\pi(\sigma_L) = \mathrm{BC}_{L/F}(\pi_i)$ for
(exactly) one $i$, as required.
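
The group theory behind the choice of $L$ can be checked by direct enumeration. The Python sketch below realizes $S_4$ as permutations of four letters and assumes a particular 2-Sylow subgroup, the dihedral group generated by a 4-cycle and a diagonal reflection of the square; it confirms that this subgroup has order 8 (so the fixed field has degree $24/8 = 3$) and that it has three conjugates, so it is not normal and the cubic extension is not Galois. All helper names are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """Composite permutation i -> p[q[i]]."""
    return tuple(p[i] for i in q)

def inverse(p):
    """Inverse permutation."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generate(gens):
    """Subgroup generated by gens, built by repeated left multiplication."""
    identity = tuple(range(len(gens[0])))
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return frozenset(group)

S4 = list(permutations(range(4)))
rotation = (1, 2, 3, 0)      # the 4-cycle 0 -> 1 -> 2 -> 3 -> 0
reflection = (2, 1, 0, 3)    # swaps 0 and 2, fixes 1 and 3
D4 = generate([rotation, reflection])
conjugates = {frozenset(compose(compose(g, h), inverse(g)) for h in D4)
              for g in S4}

print(len(D4), len(S4) // len(D4), len(conjugates))
```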

(b) Proving $\pi_{ps}(\sigma) = \pi(\sigma)$.

Write $\pi_{ps}(\sigma) = \otimes \pi_v(\sigma'_v)$ as before. Then one proves exactly as in the
tetrahedral case (but without having to take a lift to $\mathrm{GL}_3(\mathbb{C})$) that the
non-existence of an element of order 6 in $S_4$ implies $\sigma_v = \sigma'_v$ for almost
all $v$. Since no new ideas are involved, we simply refer the reader to [Tu]
for details.
SERRE’S CONJECTURE
Bas Edixhoven


The aim of the first section is to state Serre’s conjecture and to tell what 
is presently known about it, without proof. We start by recalling what 
modular forms are. Then we recall the result, due to Deligne, that to a 
mod p modular form one can associate a mod p Galois representation. After 
that we state Serre’s conjecture and what we know about it. In Section 2 
we will see which cases of it are actually needed in order to prove, following 
Wiles, that all semi-stable elliptic curves over Q are modular. In the last 
two sections we will sketch the proofs in those cases. These notes follow, 
to some extent, the lectures given by Dick Gross during the conference. 


1 Serre’s Conjecture: Statement and Results 


Let $N \ge 1$ and $k$ be integers and let $R$ be a $\mathbb{Z}[1/N]$-algebra. We will
first recall Katz’s definition [22] of modular forms of level N and weight 
k over R. This definition may seem more complicated than necessary, 
but it is very convenient in order to deal with modular forms over fields of 
positive characteristic p. For example, one has the Hasse invariant, (see [24, 
§12.4]), which cannot be lifted to characteristic zero as a form of level 1 
and weight p — 1 for p equal to 2 or 3, and one has the derivation qd/dq 
sending modular forms to modular forms, increasing the weight by p+ 1 
(p being the characteristic of the finite field). A good reference for more 
details concerning modular forms in various settings is [14]. 

Let $[\Gamma_1(N)]_R$ denote the category whose objects are pairs $(E/S/R, \alpha)$,
with $S$ an $R$-scheme, $E/S$ a generalized elliptic curve in the sense of [10] and
$\alpha : (\mathbb{Z}/N\mathbb{Z})_S \to E[N]$ an embedding of group schemes such that the image
of $\alpha$ meets all irreducible components of all geometric fibres of $E/S$. The
morphisms from $(E'/S'/R, \alpha')$ to $(E/S/R, \alpha)$ in $[\Gamma_1(N)]_R$ are Cartesian
diagrams:

$$\begin{array}{ccc}
E' & \longrightarrow & E \\
\downarrow & & \downarrow \\
S' & \longrightarrow & S
\end{array} \tag{1.1}$$

which are compatible with a and a’. For a generalized elliptic curve E over 
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a scheme S$ we have the invertible Os-module wg := 0*Q, js» obtained 
by pulling back the sheaf of Kahler differentials of E over S by the zero 
section 0 in E(.S). A modular form f of level N and weight k over R 
is then a rule, that assigns to every object (H/S/R,a) of [[i(N)]r an 
element f(E£/S/R, a) of wPig(5), compatible with morphisms in [[1(N)]p. 
The R-module of such modular forms will be denoted by M(N,k)pr. Let 
us give one example: for p a prime, the Hasse invariant is an element of 
M(1,p—1)p,. 

There are more down to earth ways to describe $M(N,k)_R$. For $N \ge 5$ 
the category $[\Gamma_1(N)]_R$ has a final object, called $(E_{\mathrm{univ}}/X_1(N)_R, \alpha_{\mathrm{univ}})$, and 
in that case one simply has $M(N,k)_R = H^0(X_1(N)_R, \omega^{\otimes k})$. Recall that 
$X_1(N)_R$ is a smooth projective curve over $R$ whose fibres are geometrically 
irreducible. For $N \ge 1$ and $n \ge 3$ invertible in $R$ one has the description: 

\[
M(N,k)_R = H^0\bigl(M([\Gamma_1(N),\Gamma(n)]_R), \omega^{\otimes k}\bigr)^G,
\tag{1.2}
\]

where $M([\Gamma_1(N),\Gamma(n)]_R)$ denotes the moduli scheme parametrizing triples 
$(E/S/R, \alpha, \beta)$ consisting of a generalized elliptic curve together with 
embeddings of group schemes $\alpha\colon (\mathbb{Z}/N\mathbb{Z})_S \to E[N]$ and 
$\beta\colon (\mathbb{Z}/n\mathbb{Z})^2_S \to E[n]$ such that the image of 
$\alpha+\beta$ meets all irreducible components of all geometric fibres, and where $G$ 
is the group $\mathrm{GL}_2(\mathbb{Z}/n\mathbb{Z})$. This $M([\Gamma_1(N),\Gamma(n)]_R)$ is a smooth projective 
curve over $R$, but its fibres are not geometrically irreducible. 

In the special case $R = \mathbb{C}$ one easily shows that $M(N,k)_{\mathbb{C}}$ is naturally 
isomorphic to the space of modular forms defined as certain holomorphic 
functions on the upper half plane $\mathbb{H}$ (see Chapter III, Section 1.5). 

One can show, for example by using the modular form $\Delta$ in $M(1,12)_R$, 
that $M(N,k)_R = 0$ for all $k < 0$. 

Let $R \to R'$ be a morphism of $\mathbb{Z}[1/N]$-algebras. Then we have a morphism 
of $R'$-modules $M(N,k)_R \otimes_R R' \to M(N,k)_{R'}$. Such a morphism 
is not always an isomorphism (consider for example $\mathbb{Z} \to \mathbb{F}_2$, $N = 1$ and 
$k = 1$). However, it is an isomorphism when $R \to R'$ is flat (use (1.2)). 

Over $\mathbb{Z}[[q]]$ one has the Tate curve $\mathrm{Tate}(q)$, which is equipped with a 
basis $dt/t$ of $\omega_{\mathrm{Tate}(q)/\mathbb{Z}[[q]]}$. One has an isomorphism of groups: 

\[
(\mathbb{Z}/N\mathbb{Z})^2 \xrightarrow{\ \sim\ } \mathrm{Tate}(q)[N]\bigl(\overline{\mathbb{Q}}((q^{1/N}))\bigr),
\qquad (a,b) \mapsto \zeta_N^a (q^{1/N})^b,
\tag{1.3}
\]

where $\zeta_N \in \overline{\mathbb{Q}}$ is a fixed root of unity of order $N$. For $R$ a $\mathbb{Z}[1/N, \zeta_N]$-algebra 
one has the standard $q$-expansion map $M(N,k)_R \to R[[q]]$ sending 
$f$ to the series $\sum_{n \ge 0} a_n(f) q^n$ defined by: 

\[
f(\mathrm{Tate}(q), i \mapsto \zeta_N^i) = \Bigl(\sum_{n \ge 0} a_n(f) q^n\Bigr) (dt/t)^{\otimes k}.
\tag{1.4}
\]

This map is injective. The $R$-modules $M(N,k)_R$ can also be defined as 
follows. One replaces “generalized elliptic curve” in the definition we gave 
by “elliptic curve” and one demands that the $q$-expansions obtained from 
all points of order $N$ of the Tate curve over $\mathbb{Z}[\zeta_N]((q^{1/N}))$ are power series 
in $q^{1/N}$ (a priori they are Laurent series). The $R$-module of cusp forms 
of level $N$ and weight $k$ is defined to be the submodule $M^0(N,k)_R$ of 
$M(N,k)_R$ of those $f$ all of whose $q$-expansions have zero as constant 
coefficient. Equivalently, they are the forms that vanish on degenerate elliptic 
curves. 

Suppose that $k \ge 1$. The $R$-modules $M(N,k)_R$ and $M^0(N,k)_R$ are 
equipped with certain endomorphisms. For $n \ge 1$ one has the Hecke operator 
$T_n$, defined in terms of isogenies of degree $n$. For $a$ in $(\mathbb{Z}/N\mathbb{Z})^*$ one has 
the diamond operator $\langle a \rangle$, induced by the automorphism of $[\Gamma_1(N)]_R$ that 
sends $(E/S/R, \alpha)$ to $(E/S/R, a\alpha)$. The action of these operators is given 
by the usual formulas in terms of $q$-expansions (see Chapter III, Sections 2.4 
and 2.5). 

In particular, one has $a_1(T_n(f)) = a_n(f)$. The construction of the $T_n$ 
in this generality is a bit complicated, especially for $k = 1$. See for example 
[17, §4] for a construction of $T_p$ on $M(N,1)_{\mathbb{F}_p}$. In general, one can construct 
the $T_n$ as follows. It suffices to construct the $T_p$ with $p$ prime. If $p$ divides 
$N$ then $p$ is invertible in $R$ and the construction is easy. So suppose that 
$p$ does not divide $N$. In this case one proceeds as in [17, §4]. Take $f$ in 
$M(N,k)_R$ and view it as a $G$-invariant section of $\omega^{\otimes k}$ on $M([\Gamma_1(N),\Gamma(n)]_R)$, 
as in (1.2). Then restrict it to the complement of the cusps in 
$M([\Gamma_1(N),\Gamma(n)]_R)$. This complement is affine, hence this restriction of $f$ 
is a linear combination, with coefficients in $R$, of sections of $\omega^{\otimes k}$ over the 
complement of the cusps in $M([\Gamma_1(N),\Gamma(n)]_{\mathbb{Z}[1/nN]})$, and 
one knows what $T_p$ does with those. One verifies that the $T_p(f)$ obtained 
in this way is again $G$-invariant and regular at the cusps. 

The endomorphisms $T_n$ and $\langle a \rangle$ of $M(N,k)_R$ all commute with each 
other. For $\varepsilon$ a character of $(\mathbb{Z}/N\mathbb{Z})^*$ with values in $R^*$ let $M(N,k,\varepsilon)_R$ 
denote the $R$-submodule of $M(N,k)_R$ of elements $f$ such that $\langle a \rangle(f) = 
\varepsilon(a) f$ for all $a$; such $f$ will be called forms of type $(N,k,\varepsilon)$. If $R$ is a field 
and $f$ a non-zero element of $M(N,k)_R$ which is an eigenform for all $T_n$, then 
the formula $a_1(T_n(f)) = a_n(f)$ implies that the corresponding eigenspace 
for the $T_n$ has dimension one, and that there is a unique character $\varepsilon$ such 
that $f$ is of type $(N,k,\varepsilon)$. 
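The identity $a_1(T_n(f)) = a_n(f)$ can be checked by hand in the simplest case. The following is a minimal sketch for level 1 and weight 12 over $\mathbb{Z}$, with $f = \Delta$. It assumes the classical $q$-expansion formula $a_m(T_p f) = a_{pm}(f) + p^{k-1} a_{m/p}(f)$ for $p$ prime (level 1, trivial character), which is one of the "usual formulas" referred to above; the function names are illustrative and do not come from the text.

```python
def delta_coefficients(bound):
    """Coefficients a_0 .. a_{bound-1} of Delta = q * prod_{n>=1} (1 - q^n)^24."""
    series = [0] * bound
    series[0] = 1
    for n in range(1, bound):
        for _ in range(24):
            # In-place multiplication by (1 - q^n); descending order keeps
            # series[i - n] at its old value while series[i] is updated.
            for i in range(bound - 1, n - 1, -1):
                series[i] -= series[i - n]
    # Multiplying by q shifts every coefficient up by one degree.
    return [0] + series[:bound - 1]


def hecke_prime(coeffs, p, k):
    """Truncated q-expansion of T_p f for p prime, level 1, weight k.

    Uses the assumed formula a_m(T_p f) = a_{pm}(f) + p^(k-1) a_{m/p}(f),
    where the second term is present only when p divides m.
    """
    count = (len(coeffs) - 1) // p + 1   # a_{pm} is known only for pm < len(coeffs)
    result = []
    for m in range(count):
        value = coeffs[p * m]
        if m % p == 0:
            value += p ** (k - 1) * coeffs[m // p]
        result.append(value)
    return result


tau = delta_coefficients(60)
for p in (2, 3, 5, 7):
    image = hecke_prime(tau, p, 12)
    # Cusp forms of level 1 and weight 12 are spanned by Delta,
    # so T_p Delta must be tau(p) * Delta.
    assert image == [tau[p] * c for c in tau[:len(image)]]
    # The special case m = 1: a_1(T_p f) = a_p(f).
    assert image[1] == tau[p]
```

Since the eigenvalue of $T_p$ is read off from the coefficient of $q$, a normalized eigenform is determined by its eigenvalues, which is the one-dimensionality statement above.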


Theorem 1.5 (Deligne) Let $p$ be a prime, $N$ an integer prime to $p$, and 
$k \ge 1$. Let $f$ be a non-zero eigenform in $M(N,k,\varepsilon)_{\overline{\mathbb{F}}_p}$ and let $a_n$ be its 
eigenvalue for $T_n$. Then there is a unique semi-simple continuous representation 
$\rho_f\colon G_{\mathbb{Q}} := \mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q}) \to \mathrm{GL}_2(\overline{\mathbb{F}}_p)$, unramified outside $pN$, such 
that for all primes $l$ not dividing $pN$ the image $\rho_f(\mathrm{Frob}_l)$ of the 
(arithmetic) Frobenius element has trace $a_l$ and determinant $\varepsilon(l) l^{k-1}$. 

(The topology on $\mathrm{GL}_2(\overline{\mathbb{F}}_p)$ in this statement is the discrete one.) In fact, 
Deligne [8] showed that for an eigenform $f$ of weight $k \ge 2$ and with 
coefficients in $\overline{\mathbb{Q}}$, one even has a $p$-adic representation $\rho_f$ of $G_{\mathbb{Q}}$ with values 
in $\mathrm{GL}_2(\overline{\mathbb{Q}}_p)$. The case $k = 2$ had already been treated by Shimura in [43]. 
A detailed and reasonably elementary proof of Theorem 1.5 can be found 
in [17]. 
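As a small numerical illustration of Theorem 1.5, take $N = 1$, $k = 12$, $\varepsilon = 1$ and $f$ the reduction of $\Delta$ modulo $p = 691$. The sketch below computes the trace $\tau(l) \bmod p$ and the determinant $l^{11} \bmod p$ prescribed for $\rho_f(\mathrm{Frob}_l)$, and checks the classical Ramanujan congruence $\tau(l) \equiv 1 + l^{11} \pmod{691}$. The congruence is an outside fact, not taken from the text; it says that the prescribed characteristic polynomial factors as $(X-1)(X-l^{11})$, as one expects for the semi-simple reducible representation $1 \oplus \chi_p^{11}$. All function names are illustrative.

```python
def ramanujan_tau(bound):
    """List [tau(0) = 0, tau(1), ..., tau(bound - 1)] from Delta = q * prod (1 - q^n)^24."""
    series = [0] * bound
    series[0] = 1
    for n in range(1, bound):
        for _ in range(24):
            for i in range(bound - 1, n - 1, -1):   # multiply by (1 - q^n) in place
                series[i] -= series[i - n]
    return [0] + series[:bound - 1]


def frobenius_data(tau, l, p, k=12):
    """(trace, determinant) of rho_f(Frob_l) mod p for f = Delta, as in Theorem 1.5 with N = 1."""
    return tau[l] % p, pow(l, k - 1, p)


P = 691
TAU = ramanujan_tau(30)
for l in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29):
    trace, det = frobenius_data(TAU, l, P)
    # X^2 - trace*X + det splits as (X - 1)(X - l^11) exactly when trace = 1 + det.
    assert trace == (1 + det) % P
```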

Let $f$ be as in Theorem 1.5. The fact that every elliptic curve has 
the automorphism $-1$ implies that $\varepsilon(-1) f = \langle -1 \rangle(f) = (-1)^k f$, hence 
that $\varepsilon(-1) = (-1)^k$. It follows that, for $\sigma$ in $G_{\mathbb{Q}}$ any complex conjugation, 
$\det(\rho_f(\sigma)) = \varepsilon(-1)(-1)^{k-1} = -1$. A representation of $G_{\mathbb{Q}}$ with this 
property will be called odd. Serre conjectured [37, $(3.2.3_?)$] that any 
continuous odd representation $\rho\colon G_{\mathbb{Q}} \to \mathrm{GL}_2(\overline{\mathbb{F}}_p)$ is isomorphic to some $\rho_f$. 
Then of course the question arises how to see from $\rho$ what the possible 
levels, weights and characters for such $f$ are. So Serre also conjectured 
[37, $(3.2.4_?)$] that for irreducible $\rho$, such an $f$ exists of a certain “minimal” 
type $(N(\rho), k(\rho), \varepsilon(\rho))$. Section 5 of [37] gives a number of examples where 
this conjecture can be at least partially verified. These examples concern $\rho$ 
with values in $\mathrm{GL}_2(\mathbb{F}_q)$ with $q$ equal to 2, 3, 4, 7 and 9. In the first two 
cases, every $\rho$ satisfies [37, $(3.2.3_?)$]. Recently, Shepherd-Barron and Taylor 
[42] proved similar results for $q$ equal to 4 and 5. In general, very 
little is known about [37, $(3.2.3_?)$], but there has been a lot of progress on 
the question of whether [37, $(3.2.3_?)$] implies [37, $(3.2.4_?)$]. Let us mention 
that [37, §4] gives some spectacular consequences of [37, $(3.2.4_?)$], including 
Fermat’s Last Theorem and variants of it, and Shimura-Taniyama-Weil. It 
seems that no work has been done on the problem posed in Remark 4 on 
page 197 of [37]. Before we can state Serre’s conjecture, we need some 
terminology. 


Let $p$ be a prime and $\rho\colon G_{\mathbb{Q}} \to \mathrm{GL}_2(\overline{\mathbb{F}}_p)$ a continuous representation. 
Then the image $G$ of $\rho$ is finite, let’s say equal to $\mathrm{Gal}(K/\mathbb{Q})$ with $K$ a finite 
Galois extension of $\mathbb{Q}$ in $\overline{\mathbb{Q}}$. It follows that $\rho$ is unramified at all but finitely 
many primes. The number $N(\rho)$ is by definition the Artin conductor of $\rho$, 
except that one doesn’t take the prime $p$ into account. More precisely, $N(\rho)$ 
will be a positive integer prime to $p$. Let $l$ be a prime number different 
from $p$. Let $I_l$ denote the inertia subgroup at $l$ of $G$ corresponding to a 
place of $K$ above $l$, and let $I_l = I_{l,0} \supset I_{l,1} \supset \cdots$ be the higher ramification 
subgroups: $I_{l,i}$ is the subgroup of the decomposition group at the chosen 
place whose elements act trivially on the ring of integers modulo the $(i+1)$th 
power of the maximal ideal. The valuation of $N(\rho)$ at $l$, i.e., the number 
of factors of $l$ in it, is then given by the formula: 

\[
n(l,\rho) = \sum_{i \ge 0} \frac{1}{[I_{l,0} : I_{l,i}]} \dim\bigl(V/V^{I_{l,i}}\bigr),
\tag{1.6}
\]

where $V$ is a two-dimensional $\overline{\mathbb{F}}_p$-vector space with a $G$-action giving $\rho$, and 
where $V^{I_{l,i}}$ denotes its subspace of invariants under $I_{l,i}$. For a motivation 
for formula (1.6), see [38], Section 19. Let us note, by the way, that replacing 
$V/V^{I_{l,i}}$ by the kernel of the map to the coinvariants $V \to V_{I_{l,i}}$ would 
give the same result. For $i > 0$ this is clear since then $I_{l,i}$ is an $l$-group; for 
$i = 0$ one uses that $V_{I_{l,0}}$ is obtained by first taking the coinvariants for the 
$l$-group $I_{l,1}$ and then the coinvariants for the cyclic group $I_{l,0}/I_{l,1}$. 
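To see how formula (1.6) is evaluated, here is a minimal sketch. It assumes that the ramification data at $l$ are supplied by hand as a list of pairs (order of $I_{l,i}$, dimension of $V^{I_{l,i}}$) for $i = 0, 1, 2, \ldots$, stopping once $I_{l,i}$ is trivial. The function name and the sample data are hypothetical and only serve to illustrate the arithmetic.

```python
from fractions import Fraction


def conductor_exponent(filtration, dim_v=2):
    """Evaluate the right-hand side of formula (1.6).

    filtration[i] = (order of I_{l,i}, dim of the invariants V^{I_{l,i}}),
    listed for i = 0, 1, 2, ... until the group I_{l,i} is trivial.
    """
    order_0 = filtration[0][0]
    total = Fraction(0)
    for order_i, dim_invariants in filtration:
        index = Fraction(order_0, order_i)            # [I_{l,0} : I_{l,i}]
        total += (dim_v - dim_invariants) / index     # dim(V / V^{I_{l,i}}) / index
    return total


# Unramified at l: inertia is trivial and all of V is invariant.
assert conductor_exponent([(1, 2)]) == 0
# Tamely ramified with a line of invariants: exponent 1.
assert conductor_exponent([(5, 1)]) == 1
# Tamely ramified with no invariants: exponent 2.
assert conductor_exponent([(3, 0)]) == 2
# Hypothetical wild example at l = 3: |I_0| = 6, |I_1| = 3, I_2 trivial, no invariants.
assert conductor_exponent([(6, 0), (3, 0)]) == 3
```

Exact rational arithmetic is used because the individual terms of (1.6) need not be integers, even though the sum is.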

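We are now able to define the character $\varepsilon(\rho)$ and the image of $k(\rho)$ in 
$\mathbb{Z}/(p-1)\mathbb{Z}$. For each positive integer $n$ let $\chi_n\colon G_{\mathbb{Q}} \to (\mathbb{Z}/n\mathbb{Z})^*$ denote the 
cyclotomic character of the $n$th roots of unity, i.e., the character such that 
for all $\sigma$ in $G_{\mathbb{Q}}$ and all $z$ in $\overline{\mathbb{Q}}$ with $z^n = 1$ we have $\sigma(z) = z^{\chi_n(\sigma)}$. We 
will often view a character of the group $(\mathbb{Z}/n\mathbb{Z})^*$ as a character of $G_{\mathbb{Q}}$, by 
composing it with $\chi_n$. With these conventions, Theorem 1.5 implies that 
for $f$ an eigenform in $M(N,k,\varepsilon)_{\overline{\mathbb{F}}_p}$ one has $\det\circ\rho_f = \varepsilon\chi_p^{k-1}$. This equality 
shows that $\det\circ\rho_f$ is a character of the group $(\mathbb{Z}/pN\mathbb{Z})^*$. This group is 
canonically isomorphic to $(\mathbb{Z}/N\mathbb{Z})^* \times \mathbb{F}_p^*$ via the corresponding isomorphism 
of rings. Under this isomorphism, the restriction of $\det\circ\rho_f$ to $(\mathbb{Z}/N\mathbb{Z})^*$ is 
$\varepsilon$, and the restriction to $\mathbb{F}_p^*$ is $\chi_p^{k-1}$. Let us now go back to the representation 
$\rho$. By the definition of $N(\rho)$, it is unramified away from $pN(\rho)$. Since 
the maximal abelian extension of $\mathbb{Q}$ is the cyclotomic extension, $\det\circ\rho$ can 
be considered as a character of $(\mathbb{Z}/p^n\mathbb{Z})^* \times (\mathbb{Z}/N(\rho)^m\mathbb{Z})^*$, for some $n, m > 0$. 
Since the character has values in $\overline{\mathbb{F}}_p^*$, one can take $n = 1$. Comparing formula 
(1.6) for $\rho$ and $\det\circ\rho$, and using some class field theory, one sees that 
one can take $m = 1$. Then one defines $\varepsilon(\rho)$ to be the restriction of $\det\circ\rho$ to 
the factor $(\mathbb{Z}/N(\rho)\mathbb{Z})^*$, and the image of $k(\rho)$ in $\mathbb{Z}/(p-1)\mathbb{Z}$ to be such that 
the restriction of $\det\circ\rho$ to $\mathbb{F}_p^*$ is $\chi_p^{k(\rho)-1}$. 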


Now we get to the exact definition of $k(\rho)$. The first thing to note is 
that $k(\rho)$ depends only on the restriction of $\rho$ to some inertia subgroup 
$I_p \subset G_{\mathbb{Q}}$ at $p$. Hence the level $N(\rho)$ reflects the ramification away from $p$ 
and the weight $k(\rho)$ the ramification at $p$. In order to state the definition 
of $k(\rho)$ we first have to classify the two-dimensional representations over $\overline{\mathbb{F}}_p$ 
of a decomposition subgroup $G_p \subset G_{\mathbb{Q}}$ at $p$ (this subgroup $G_p$ depends on 
the choice of a maximal ideal containing $p$, but since these are permuted 
transitively by $G_{\mathbb{Q}}$ the choice will not matter). For this classification, we 
need some terminology. 

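Let $\overline{\mathbb{Z}}$ be the integral closure of $\mathbb{Z}$ in $\overline{\mathbb{Q}}$. Let us choose a morphism of 
rings $\overline{\mathbb{Z}} \to \overline{\mathbb{F}}_p$. This gives us a decomposition group $G_p$ (the stabilizer of 
the kernel). The action of $G_p$ on $\overline{\mathbb{F}}_p$ gives a surjection $G_p \to G_{\mathbb{F}_p}$; its kernel 
$I_p$ is the inertia group. We can identify $G_p$ with $\mathrm{Gal}(\overline{\mathbb{Q}}_p/\mathbb{Q}_p)$ and $I_p$ with 
$\mathrm{Gal}(\overline{\mathbb{Q}}_p/\mathbb{Q}_p^{\mathrm{unr}})$, where $\mathbb{Q}_p^{\mathrm{unr}}$ is the maximal unramified extension of $\mathbb{Q}_p$. Let 
$\mathbb{Q}_p^{\mathrm{tr}}$ be the maximal tamely ramified extension of $\mathbb{Q}_p$. Then the quotient 
$I_{p,t} = \mathrm{Gal}(\mathbb{Q}_p^{\mathrm{tr}}/\mathbb{Q}_p^{\mathrm{unr}})$ of $I_p$ is called the tame ramification group, and the 
subgroup $I_{p,w} := \mathrm{Gal}(\overline{\mathbb{Q}}_p/\mathbb{Q}_p^{\mathrm{tr}})$ of $I_p$ the wild ramification group. The tame 
extensions of $\mathbb{Q}_p^{\mathrm{unr}}$ are obtained by taking $n$th roots of the element $p$, with $n$ 
prime to $p$. It follows that $I_{p,t}$ can be identified with $\varprojlim \mathbb{F}_{p^n}^*$, where the 
limit is taken over the $n \ge 1$ and where the transition morphisms are norm 
maps. A character $\varphi\colon I_{p,t} \to \overline{\mathbb{F}}_p^*$ is called of level $n$ if $n$ is the smallest 
integer such that $\varphi$ factors through $\mathbb{F}_{p^n}^*$. The $n$ characters 

\[
I_{p,t} \longrightarrow \mathbb{F}_{p^n}^* \longrightarrow \overline{\mathbb{F}}_p^*
\]

that are induced by embeddings of fields $\mathbb{F}_{p^n} \to \overline{\mathbb{F}}_p$ are called the fundamental 
characters of level $n$. 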
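The notion of level can be made concrete with a little bookkeeping. The sketch below assumes that one fixes a fundamental character $\psi$ of level $n$ and writes every character of $\mathbb{F}_{p^n}^*$ as $\psi^m$ with $m$ taken modulo $p^n - 1$. Since the transition maps are norms, $\psi^m$ factors through $\mathbb{F}_{p^d}^*$ (for $d$ dividing $n$) exactly when $(p^n-1)/(p^d-1)$ divides $m$. The function names are illustrative only.

```python
def level(m, n, p):
    """Level of the character psi^m of F_{p^n}^*, with psi fundamental of level n.

    The norm F_{p^n}^* -> F_{p^d}^* is x -> x^e with e = (p^n - 1)/(p^d - 1),
    so psi^m factors through it exactly when e divides m.
    """
    for d in range(1, n + 1):
        if n % d == 0 and m % ((p**n - 1) // (p**d - 1)) == 0:
            return d


def fundamental_exponents(n, p):
    """Exponents m such that psi^m is a fundamental character of level n.

    These are the Frobenius conjugates psi^(p^i), one per embedding of F_{p^n}.
    """
    return [p**i % (p**n - 1) for i in range(n)]


# p = 5, n = 2: the two fundamental characters of level 2 are psi and psi^5.
assert fundamental_exponents(2, 5) == [1, 5]
assert level(1, 2, 5) == 2
# psi^(1 + p) factors through the norm to F_5^*, hence has level 1.
assert level(6, 2, 5) == 1
# The trivial character has level 1.
assert level(0, 3, 2) == 1
```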

Suppose now that $\rho_p\colon G_p \to \mathrm{GL}(V)$ is a continuous 2-dimensional 
representation on a $\overline{\mathbb{F}}_p$-vector space. Let $V^{\mathrm{ss}}$ be the semi-simplification of 
$V$ for the action of $G_p$. We claim that $I_{p,w}$ acts trivially on $V^{\mathrm{ss}}$. To see 
that, let $W$ be an irreducible factor of $V^{\mathrm{ss}}$. Then the subspace $W^{I_{p,w}}$ of 
$W$ is stable under $G_p$ because $I_{p,w}$ is a normal subgroup of it, so either 
$W^{I_{p,w}}$ is trivial or it is $W$. Since the image of $G_p$ in $\mathrm{GL}(W)$ is finite, the 
representation of $G_p$ on $W$ can be realized over a finite extension of $\mathbb{F}_p$. 
For such a realization $W'$, the number of fixed points for $I_{p,w}$ is a multiple 
of $p$ since the image of $I_{p,w}$ is a $p$-group (the number of fixed points is 
congruent modulo $p$ to the cardinality of $W'$), hence $W'$ has non-trivial 
$I_{p,w}$-invariants. This proves that the action of $I_p$ on $V^{\mathrm{ss}}$ is given by two 
characters $\varphi, \varphi'\colon I_{p,t} \to \overline{\mathbb{F}}_p^*$. Since $\mathrm{Gal}(\overline{\mathbb{F}}_p/\mathbb{F}_p)$ acts by conjugation on $I_{p,t}$, 
it follows that $\{\varphi^p, \varphi'^p\} = \{\varphi, \varphi'\}$. This means that there are two cases: 
either $\varphi$ and $\varphi'$ are both of level one, or $\varphi$ and $\varphi'$ are both of level two and 
$\varphi^p = \varphi'$, $\varphi'^p = \varphi$. In the first case $\rho_p$ is reducible, whereas in the second 
case $\rho_p$ is absolutely irreducible. 

We need one more ingredient before we can define $k(\rho_p)$: the notion 
of “finiteness at $p$”. Let $F$ be a finite extension of $\mathbb{F}_p$ such that $\rho_p$ can 
be realized over $F$. An $F$-vector space scheme $V$ over a scheme $S$ is 
then an $S$-scheme $V$, together with a structure of $F$-vector space on all 
the sets of points $V(T)$ for $S$-schemes $T$, functorially in $T$. Equivalently, 
an $F$-vector space scheme over $S$ is a commutative group scheme over $S$ 
given with an action of $F$ on it by endomorphisms. An $F$-vector space 
scheme $V$ over $\mathbb{Q}_p$ gives rise to the representation of $G_p$ on the $F$-vector 
space $V(\overline{\mathbb{Q}}_p)$. This construction induces an equivalence between the category 
of $F$-vector space schemes finite over $\mathbb{Q}_p$ and the category of representations 
of $G_p$ on finite dimensional $F$-vector spaces. Concretely, the 
$F$-vector space scheme $V_{\mathbb{Q}_p}$ corresponding to $\rho_p$ can be obtained as the quotient 
of the scheme $F^2_{\mathbb{Q}_p} \times_{\mathrm{Spec}(\mathbb{Q}_p)} \mathrm{Spec}(\overline{\mathbb{Q}}_p)$ by the group $G_p$, with $\sigma$ in $G_p$ 
acting as $(\rho_p(\sigma)^{-1}, \mathrm{Spec}(\sigma))$. One says that $\rho_p$ is finite at $p$ if $V_{\mathbb{Q}_p}$ can be 
extended to a finite flat $F$-vector space scheme over $\mathbb{Z}_p$. It is equivalent to 
demand that $V_{\mathbb{Q}_p}$ can be extended to a finite flat $F$-vector space scheme 
over the ring of integers $\mathbb{Z}_p^{\mathrm{unr}}$ in $\mathbb{Q}_p^{\mathrm{unr}}$, or to demand that it can be extended 
over $\mathbb{Z}_p$ or $\mathbb{Z}_p^{\mathrm{unr}}$ as a finite flat group scheme. Finally, one can formulate 
the condition that $\rho_p$ be finite at $p$ in Galois theoretic terms, without using 
group schemes (see [15, Proposition 8.1], and its proof, and [37, (2.4.7)]). 

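We will now give the definition of $k(\rho_p)$. Let $\psi$ and $\psi'$ be the two 
fundamental characters of level 2, and let $\chi$ denote the fundamental 
character of level 1, that is, the restriction of $\chi_p$ to $I_p$. 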


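Definition 1.7. Let $\rho_p\colon G_p \to \mathrm{GL}(V)$, $V^{\mathrm{ss}}$, $\varphi$ and $\varphi'$ be as above. We 
associate an integer $k(\rho_p)$ to $\rho_p$ as follows. 


1. Suppose that $\varphi$ and $\varphi'$ are of level 2. We have: 

\[
\rho_p|_{I_p} \cong \begin{pmatrix} \varphi & 0 \\ 0 & \varphi' \end{pmatrix}.
\]

After interchanging $\varphi$ and $\varphi'$ if necessary, we have (uniquely) 
$\varphi = \psi^{a+pb} = \psi^a\psi'^b$ and $\varphi' = \psi'^a\psi^b$ with $0 \le a < b \le p-1$. We set 
$k(\rho_p) = 1 + pa + b$. 


2. Suppose that $\varphi$ and $\varphi'$ are of level 1. 


(a) If $\rho_p|_{I_{p,w}}$ is trivial, then we have: 

\[
\rho_p|_{I_p} \cong \begin{pmatrix} \chi^a & 0 \\ 0 & \chi^b \end{pmatrix}
\]

with $0 \le a \le b \le p-2$. We set $k(\rho_p) = 1 + pa + b$. 

(b) Suppose that $\rho_p|_{I_{p,w}}$ is not trivial. We have: 

\[
\rho_p|_{I_p} \cong \begin{pmatrix} \chi^\beta & * \\ 0 & \chi^\alpha \end{pmatrix}
\]

for unique $\alpha$ and $\beta$ with $0 \le \alpha \le p-2$ and $1 \le \beta \le p-1$. We 
set $a = \min(\alpha, \beta)$, $b = \max(\alpha, \beta)$. If $\chi^{\beta-\alpha} = \chi$ and $\rho_p \otimes \chi^{-\alpha}$ is 
not finite at $p$ then we set $k(\rho_p) = 1 + pa + b + p - 1$, otherwise 
we set $k(\rho_p) = 1 + pa + b$. 


□ 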


For $\rho\colon G_{\mathbb{Q}} \to \mathrm{GL}_2(\overline{\mathbb{F}}_p)$ continuous, we define $k(\rho)$ to be $k(\rho_p)$, where $\rho_p$ is 
the restriction of $\rho$ to a decomposition group at $p$. 
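The case distinction in Definition 1.7 is easy to lose track of, so here is a minimal sketch of it as three small functions. The exponents and the finiteness flag are assumed to be supplied by the caller, since deciding whether $\rho_p \otimes \chi^{-\alpha}$ is finite at $p$ is not something a few lines of code can do. All names are illustrative.

```python
def weight_level_two(a, b, p):
    """Case 1: phi = psi^(a + p*b) with 0 <= a < b <= p - 1."""
    assert 0 <= a < b <= p - 1
    return 1 + p * a + b


def weight_level_one_tame(a, b, p):
    """Case 2(a): wild inertia acts trivially, rho|I_p = diag(chi^a, chi^b)."""
    assert 0 <= a <= b <= p - 2
    return 1 + p * a + b


def weight_level_one_wild(alpha, beta, p, twist_is_finite):
    """Case 2(b): rho|I_p is upper triangular with diagonal (chi^beta, chi^alpha).

    twist_is_finite says whether rho_p tensor chi^(-alpha) is finite at p;
    it is an input here, not something this sketch computes.
    """
    assert 0 <= alpha <= p - 2 and 1 <= beta <= p - 1
    a, b = min(alpha, beta), max(alpha, beta)
    k = 1 + p * a + b
    # chi has order p - 1, so chi^(beta - alpha) = chi iff beta - alpha = 1 mod (p - 1).
    if (beta - alpha - 1) % (p - 1) == 0 and not twist_is_finite:
        k += p - 1
    return k


# Unramified at p: a = b = 0 gives weight 1.
assert weight_level_one_tame(0, 0, 7) == 1
# Diagonal (chi, 1) with a finite flat extension: weight 2.
assert weight_level_one_wild(0, 1, 5, twist_is_finite=True) == 2
# Same diagonal, not finite at p: weight p + 1.
assert weight_level_one_wild(0, 1, 5, twist_is_finite=False) == 6
# Level 2 with (a, b) = (0, 1): weight 2.
assert weight_level_two(0, 1, 5) == 2
```

For $p = 2$, $\alpha = 0$, $\beta = 1$ and a twist that is not finite, the last function returns 3.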


Conjecture 1.8 (Serre) Let $p$ be a prime number and $\rho\colon G_{\mathbb{Q}} \to \mathrm{GL}_2(\overline{\mathbb{F}}_p)$ 
a continuous irreducible odd representation. Then there exists an eigenform 
$f$ in $M^0(N(\rho), k(\rho), \varepsilon(\rho))_{\overline{\mathbb{F}}_p}$ such that $\rho$ is isomorphic to $\rho_f$. 

Several remarks should be made at this point. First of all, it is easy to 
see that for $f$ an eigenform which is not cuspidal, the representation $\rho_f$ 
is reducible (restrict $f$ to the cusps, and study the action of the Hecke 
operators on the cusps). This explains why, in the conjecture, one can 
demand that $f$ be a cuspform. 

The second remark is more important. The conjecture, as we state it 
here (i.e., as suggested in [15, 4.3]), is not equivalent to the one stated 
by Serre in [37, $(3.2.4_?)$]. The difference comes from the fact that the 
modular forms over $\overline{\mathbb{F}}_p$ considered in [37] are defined to be those obtained 
by reduction mod $p$ of forms over $\overline{\mathbb{Q}}$. More precisely, in [37], a cuspidal 
modular form over $\overline{\mathbb{F}}_p$ of some type $(N,k,\varepsilon)$, with $N$ prime to $p$ and $k \ge 2$, 
is an element in the image of the map $M^0(N,k,\tilde\varepsilon)_{\overline{\mathbb{Z}}_p} \to M^0(N,k,\varepsilon)_{\overline{\mathbb{F}}_p}$, 
where $\overline{\mathbb{Z}}_p$ is the integral closure of $\mathbb{Z}_p$ in $\overline{\mathbb{Q}}_p$ and $\tilde\varepsilon\colon (\mathbb{Z}/N\mathbb{Z})^* \to \overline{\mathbb{Z}}_p^*$ is the 
Teichmüller lift of $\varepsilon$, i.e., $\tilde\varepsilon$ induces $\varepsilon$ and they have the same order. The 
problem is that these reduction maps are not all surjective. Before we 
discuss what is known today about Serre’s conjecture, we will discuss the 
differences between Conjecture 1.8 and [37, $(3.2.4_?)$]. It was suggested by 
Serre in [40] to replace the mod $p$ modular forms in [37] by those defined 
by Katz, i.e., the ones we are using here. See also [41]. 

Let us first consider the problem of lifting modular forms from $\overline{\mathbb{F}}_p$ to $\overline{\mathbb{Z}}_p$, 
without paying attention to the character. Then we have the following 
result. 


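Lemma 1.9 Let $p$ be a prime, $N \ge 1$ prime to $p$. 


1. Suppose that $k \ge 2$. Then the map $M^0(N,k)_{\overline{\mathbb{Z}}_p} \to M^0(N,k)_{\overline{\mathbb{F}}_p}$ is 
surjective if $N \ne 1$ or if $p > 3$. 


2. The map $M^0(1,k)_{\overline{\mathbb{Z}}_2} \to M^0(1,k)_{\overline{\mathbb{F}}_2}$ is not surjective if and only if 
$k > 12$ and ($k \equiv 1 \bmod 2$ or $k \equiv 2 \bmod 12$). 


3. The map $M^0(1,k)_{\overline{\mathbb{Z}}_3} \to M^0(1,k)_{\overline{\mathbb{F}}_3}$ is not surjective if and only if 
$k > 12$ and $k \equiv 2 \bmod 12$. 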


Proof. Let us prove the first statement; the other two can be proved 
using the explicit descriptions of the rings of modular forms of level one 
over $\mathbb{Z}$, $\mathbb{F}_2$ and $\mathbb{F}_3$ found in [9, Proposition 6.2]. Suppose first that $N \ge 5$. 
Because $\overline{\mathbb{Z}}_p$ is flat over $\mathbb{Z}_p$, and $\overline{\mathbb{F}}_p$ is flat over $\mathbb{F}_p$, it suffices to prove that the 
reduction map induced by $\mathbb{Z}_p \to \mathbb{F}_p$ is surjective. Consider the long exact 
cohomology sequence arising from the short exact sequence of sheaves on 
$X_1(N)_{\mathbb{Z}_p}$: 

\[
0 \to \omega^{\otimes k}(-\mathrm{cusps}) \to \omega^{\otimes k}(-\mathrm{cusps}) \to i_*\omega^{\otimes k}(-\mathrm{cusps}) \to 0,
\tag{1.9.1}
\]

where the map $\omega^{\otimes k}(-\mathrm{cusps}) \to \omega^{\otimes k}(-\mathrm{cusps})$ is multiplication by $p$ and 
$i\colon X_1(N)_{\mathbb{F}_p} \to X_1(N)_{\mathbb{Z}_p}$ denotes the closed immersion. To get the surjectivity 
we want, it is sufficient to show that $H^1(X_1(N)_{\mathbb{F}_p}, \omega^{\otimes k}(-\mathrm{cusps})) = 0$, 
since by Nakayama’s lemma and the long exact sequence this implies that 
$H^1(X_1(N)_{\mathbb{Z}_p}, \omega^{\otimes k}(-\mathrm{cusps})) = 0$. The Kodaira-Spencer isomorphism (see 
[22, A1.4]) and Serre duality give isomorphisms: 

\[
H^1\bigl(X_1(N)_{\mathbb{F}_p}, \omega^{\otimes k}(-\mathrm{cusps})\bigr)
\cong H^1\bigl(X_1(N)_{\mathbb{F}_p}, \Omega^1 \otimes \omega^{\otimes (k-2)}\bigr)
\cong H^0\bigl(X_1(N)_{\mathbb{F}_p}, \omega^{\otimes (2-k)}\bigr)^{\vee}.
\]

So if $k > 2$, this shows what we want, since the degree of $\omega$ is positive. The 
case $k = 2$ is in fact easy, since the Kodaira-Spencer isomorphism identifies 
weight 2 cuspforms with differential forms, and those can be lifted. Another 
way to phrase the argument is to say that the dimensions of $M^0(N,k)_{\mathbb{F}_p}$ 
and $M^0(N,k)_{\mathbb{Q}_p}$ are given by the Riemann-Roch formula since the $H^1$-term 
vanishes, and that hence the reduction map is surjective. 

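Suppose now that $p > 3$. Then 

\[
M^0(N,k)_{\mathbb{Z}_p} = H^0\bigl(M([\Gamma_1(N),\Gamma(3)]_{\mathbb{Z}_p}), \omega^{\otimes k}(-\mathrm{cusps})\bigr)^G,
\]

with $G = \mathrm{GL}_2(\mathbb{F}_3)$. Since $p$ does not divide the order of $G$, the functor 
$M \mapsto M^G$ from $\mathbb{Z}_p[G]$-modules to $\mathbb{Z}_p$-modules is exact. Combining this 
with the long exact sequence arising from the short exact sequence 

\[
0 \to \omega^{\otimes k}(-\mathrm{cusps}) \to \omega^{\otimes k}(-\mathrm{cusps}) \to i_*\omega^{\otimes k}(-\mathrm{cusps}) \to 0
\]

on $M([\Gamma_1(N),\Gamma(3)]_{\mathbb{Z}_p})$ gives the result. 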

Suppose now that $N = 2$ or $N = 4$. Then $p \ne 2$. In these cases, 
the category $[\Gamma_1(N)]_{\mathbb{Z}_p}$ is the quotient, in the sense of algebraic stacks, for 
the action of a subgroup $G$ of $\mathrm{GL}_2(\mathbb{Z}/4\mathbb{Z})$ acting on $[\Gamma(4)]_{\mathbb{Z}_p}$. This gives a 
formula analogous to (1.2). The group $G$ is a 2-group, hence of order prime 
to $p$. One can then apply an argument which is similar to the one used in 
the case $p > 3$ above. 

Suppose now that $N = 3$. Then $p \ne 3$. Let $\mathbb{Z}_p[\zeta_3]$ be the subring of 
$\overline{\mathbb{Z}}_p$ generated by $\mathbb{Z}_p$ and a third root of unity $\zeta_3$. Let $[\Gamma(3)^{\zeta_3\text{-can}}]_{\mathbb{Z}_p[\zeta_3]}$ be 
the category of generalized elliptic curves over schemes over $\mathbb{Z}_p[\zeta_3]$ with an 
embedding $\alpha$ of the constant group scheme $(\mathbb{Z}/3\mathbb{Z})^2$, such that the Weil 
pairing of $\alpha(1,0)$ and $\alpha(0,1)$ equals $\zeta_3$. Then $[\Gamma_1(N)]_{\mathbb{Z}_p[\zeta_3]}$ is the quotient 
of $[\Gamma(3)^{\zeta_3\text{-can}}]_{\mathbb{Z}_p[\zeta_3]}$ for the action of a group of order 3. This means that 
one can again use the same argument. □ 
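Parts 2 and 3 of Lemma 1.9 can be checked numerically by a dimension count. The sketch below rests on assumptions not spelled out in the text: that the rings of level one modular forms are $\mathbb{F}_2[a_1, \Delta]$ and $\mathbb{F}_3[b_2, \Delta]$, with generators of weights 1, 2 and 12 (our reading of the descriptions cited from [9]); that cusp forms are the multiples of $\Delta$; and that the reduction map is injective modulo $p$, so that surjectivity amounts to equality of dimensions. The dimension formula in characteristic zero is the classical one. All function names are illustrative.

```python
def dim_cusp_char0(k):
    """Dimension of the level 1, weight k cusp forms in characteristic zero."""
    if k < 12 or k % 2 == 1:
        return 0
    return k // 12 - (1 if k % 12 == 2 else 0)


def dim_cusp_mod_p(k, p):
    """Dimension over F_p of the level 1, weight k cusp forms for p = 2 or 3.

    Assumes the ring of forms is F_p[g, Delta] with g of weight 1 (p = 2)
    or weight 2 (p = 3), and counts monomials g^i * Delta^j with j >= 1.
    """
    generator_weight = {2: 1, 3: 2}[p]
    return sum(1 for j in range(1, k // 12 + 1)
               if (k - 12 * j) % generator_weight == 0)


def lemma_not_surjective(k, p):
    """The criterion of Lemma 1.9, parts 2 (p = 2) and 3 (p = 3)."""
    if p == 2:
        return k > 12 and (k % 2 == 1 or k % 12 == 2)
    return k > 12 and k % 12 == 2


for p in (2, 3):
    for k in range(1, 200):
        extra_forms_mod_p = dim_cusp_mod_p(k, p) > dim_cusp_char0(k)
        assert extra_forms_mod_p == lemma_not_surjective(k, p)
```

For instance, in weight 14 there is no cusp form in characteristic zero, while $a_1^2\Delta$ and $b_2\Delta$ are non-zero cusp forms in characteristics 2 and 3.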


The proof of this lemma indicates that the case $k = 1$ is very different, since 
the degree of $\omega$ is, as one sees from the Kodaira-Spencer isomorphism, too 
small to make the $H^1$-term in the Riemann-Roch theorem vanish. Mestre 
has indeed found examples with $p > 3$ where the map $M^0(N,1)_{\overline{\mathbb{Z}}_p} \to 
M^0(N,1)_{\overline{\mathbb{F}}_p}$ is not surjective. In these examples one has an eigenform $f$ in 
$M^0(N,1)_{\overline{\mathbb{F}}_p}$ such that the image of the representation $\rho_f$ is too big to be 
embeddable in $\mathrm{GL}_2(\mathbb{C})$; if $f$ could be lifted to characteristic zero this would 
contradict the theorem of Deligne-Serre (Theorem 4.1 of [11]). Let us note 
that for a representation $\rho$ as in Conjecture 1.8 it can very well happen that 
$k(\rho) = 1$; one can check that this is equivalent to $\rho$ being unramified at $p$. 
This explains why the weight $k_\rho$, for $f$ that one finds in [37, $(3.2.4_?)$], is not 
in all cases the same as $k(\rho)$ defined above. The difference between $k_\rho$ and 
$k(\rho)$ can be summarized as follows, in the notation of Definition 1.7. There 
are only two cases where they are different; in both cases the characters 
$\varphi$ and $\varphi'$ are of level 1. In the first case, the restriction of $\rho$ to the wild 
inertia group $I_{p,w}$ is trivial and $a = 0 = b$; then $k(\rho) = 1$ and $k_\rho = p$. In 
the second case $p = 2$, $\rho$ is wildly ramified at 2, $\alpha = 0$, $\beta = 1$ and $\rho$ is not 
finite at 2; then $k(\rho) = 3$ and $k_\rho = 4$. 

Other problems arise if we take the character into account. In his course 
at the Collège de France, 1987-1988, Serre gave some counterexamples 
to his conjecture [37, $(3.2.4_?)$]. These examples are found by considering 
the genus two curve $X_1(13)$. On this curve there are two eigenforms 
of weight 2 over $\mathbb{Z}[\zeta_3]$, and the two corresponding characters are of order 6. 
The reductions mod 2 and 3 of these eigenforms have characters of order 
3 and 2, respectively. One verifies that the Galois representations corresponding 
to these mod 2 and mod 3 forms are irreducible, and that the 
weights associated to them equal two (Definition 1.7 and [37, §2] coincide 
in these cases). In fact, these representations are dihedral, induced from 
$G_{\mathbb{Q}(\sqrt{-1})}$ and $G_{\mathbb{Q}(\sqrt{-3})}$, respectively. According to [37, $(3.2.4_?)$], the mod 2 
and mod 3 reductions of the two eigenforms should have lifts to weight two 
eigenforms in characteristic zero on $X_1(13)$ with a character of the same 
order as the reduction. But since the genus is two, there are no such forms. 

In the same course at the Collège de France, Serre showed that the 
only mod $p$ eigenforms $f$ of weight at least 2 that cannot be lifted to an 
eigenform with the same level, weight and order of character are among 
those in characteristic 2 or 3, whose representation $\rho_f$ is induced from 
$G_{\mathbb{Q}(\sqrt{-1})}$ or $G_{\mathbb{Q}(\sqrt{-3})}$, respectively. This result was obtained independently 
by Carayol, see [4, §4.4], and is usually called Carayol’s Lemma. Serre’s 
proof uses a result of Nakajima implying that for $n \ge 3$ prime to $p$ and $k \ge 2$ 
the $\mathbb{Z}_p[\mathrm{GL}_2(\mathbb{Z}/n\mathbb{Z})]$-module $H^0(X(n)_{\mathbb{Z}_p}, \omega^{\otimes k})$ is projective, if all stabilizers 
are of order prime to $p$. 

Carayol’s proof uses the realization of the Galois representation associated 
to modular forms in the first cohomology group of certain $p$-adic 
sheaves on modular curves over $\overline{\mathbb{Q}}$. His arguments can be adapted to the 
sheaves $\omega^{\otimes k}$. This gives the following result, that we state without proof, 
and which can also be found in Serre’s notes. Yet another version of it can 
be found in [12, §2]. 


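Proposition 1.10 (Carayol’s Lemma) Let $N \ge 1$, $p$ a prime not dividing 
$N$, and $k \ge 2$. Let $\varepsilon\colon (\mathbb{Z}/N\mathbb{Z})^* \to \overline{\mathbb{Z}}_p^*$ be a character with $\varepsilon(-1) = 
(-1)^k$, and let $\bar\varepsilon\colon (\mathbb{Z}/N\mathbb{Z})^* \to \overline{\mathbb{F}}_p^*$ be its reduction. Consider the map 
$\phi\colon M^0(N,k,\varepsilon)_{\overline{\mathbb{Z}}_p} \to M^0(N,k,\bar\varepsilon)_{\overline{\mathbb{F}}_p}$. If $p \ge 5$, then $\phi$ is surjective. If $p = 3$ 
(resp., $p = 2$) and $f \in M^0(N,k,\bar\varepsilon)_{\overline{\mathbb{F}}_p}$ is an eigenform with $\rho_f$ irreducible 
and $f$ not in the image of $\phi$, then $\rho_f$ is induced from $\mathbb{Q}(\sqrt{-3})$ (resp., 
$\mathbb{Q}(\sqrt{-1})$). 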


In both proofs of this result it is quite clear where the $\mathbb{Q}(\sqrt{-3})$ and $\mathbb{Q}(\sqrt{-1})$ 
come from. Suppose for simplicity that $N \ge 5$. In Carayol’s proof, it comes 
from the fact that the points of $X_0(N)_{\overline{\mathbb{Q}}}$ with an automorphism (i.e., an 
automorphism of the pair $(E/\overline{\mathbb{Q}}, G)$ corresponding to it) of order 3 (resp., 
order 4) are defined over abelian extensions of $\mathbb{Q}(\sqrt{-3})$ (resp., $\mathbb{Q}(\sqrt{-1})$). 
In Serre’s proof, it comes from the fact that for primes $l \equiv -1 \bmod 3$ (resp., 
mod 4) there is no elliptic curve with an automorphism of order 6 (resp., 
4) fixing a subgroup of order $l$, implying that if an eigenform is not liftable 
(in the sense of Proposition 1.10) then it is annihilated by $T_l$ for such $l$; 
this implies that the character of $\rho_f$ vanishes on $\mathrm{Frob}_l$ for such $l$; then it 
follows that $\rho_f$ is induced as stated. 

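The statement of Carayol’s Lemma in [4] is actually different from 
Proposition 1.10. It says that if an irreducible representation $\rho\colon G_{\mathbb{Q}} \to 
\mathrm{GL}_2(\overline{\mathbb{F}}_p)$ arises from some eigenform $f$ in $M(N,k,\varepsilon)_{\overline{\mathbb{Q}}_p}$ with $k \ge 2$, then for 
every character $\varepsilon'\colon (\mathbb{Z}/N\mathbb{Z})^* \to \overline{\mathbb{Q}}_p^*$ inducing the same character $(\mathbb{Z}/N\mathbb{Z})^* \to 
\overline{\mathbb{F}}_p^*$ as $\varepsilon$, there exists an eigenform $f'$ of type $(N,k,\varepsilon')$ inducing $\rho$. In this 
statement, which does not speak of modular forms mod $p$, one does not 
suppose that $N$ is prime to $p$. 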

Let us now discuss what is known about Conjecture 1.8 and about its 
relation to [37, $(3.2.4_?)$]. 


Proposition 1.11 Let $p$ be a prime, and let $\rho\colon G_{\mathbb{Q}} \to \mathrm{GL}_2(\overline{\mathbb{F}}_p)$ be continuous, 
irreducible and odd. 


1. If $p = 2$ (resp., $p = 3$), suppose that $\rho$ is not induced from $\mathbb{Q}(\sqrt{-1})$ 
(resp., $\mathbb{Q}(\sqrt{-3})$). Then if $\rho$ satisfies Conjecture 1.8, it satisfies [37, 
$(3.2.4_?)$]. 


2. If $p = 2$, suppose that the restriction of $\rho$ to a decomposition group 
at 2 is not an extension of some character by itself. Then if $\rho$ satisfies 
[37, $(3.2.4_?)$], it satisfies Conjecture 1.8. 


Theorem 1.12 Let $p > 2$ be a prime, and let $\rho\colon G_{\mathbb{Q}} \to \mathrm{GL}_2(\overline{\mathbb{F}}_p)$ be continuous, 
irreducible and odd. Suppose that $\rho$ comes from a modular form 
of some type. Then $\rho$ satisfies Conjecture 1.8. Moreover, if $\rho$ comes from 
a mod $p$ modular form of some type $(N,k,\varepsilon)$ with $N$ prime to $p$, then $N$ is 
a multiple of $N(\rho)$, $k \ge k(\rho)$ and $\varepsilon$ is obtained from $\varepsilon(\rho)$ via composition 
with $\mathbb{Z}/N\mathbb{Z} \to \mathbb{Z}/N(\rho)\mathbb{Z}$. 


The proof of these results is quite long and many people have contributed 
to it. A complete proof can be found by reading Diamond’s article [12], 
and the references therein. A very good overview of the strategy of the 
whole proof is given in Ribet’s report [32]. In the next section we will 
see which parts of these results are used in the proof of the conjecture 
of Shimura-Taniyama-Weil for semi-stable elliptic curves over $\mathbb{Q}$ and the 
proof of Fermat’s Last Theorem. In Sections 3 and 4, we will then describe 
the proofs of those cases. To finish this section, we will briefly recall the 
history of the proofs of Proposition 1.11 and Theorem 1.12. 

For the rest of this section, let $p$ be prime and $\rho\colon G_{\mathbb{Q}} \to \mathrm{GL}_2(\overline{\mathbb{F}}_p)$ be 
continuous, irreducible and odd. We will say that $\rho$ is modular of type 
$(N,k,\varepsilon)_{\overline{\mathbb{Q}}_p}$ (resp., $(N,k,\varepsilon)_{\overline{\mathbb{F}}_p}$) if there is an eigenform $f$ in $M^0(N,k,\varepsilon)_{\overline{\mathbb{Q}}_p}$ 
(resp., $M^0(N,k,\varepsilon)_{\overline{\mathbb{F}}_p}$) such that $f$ gives the representation $\rho$. Serre formulated, 
in a letter to Mestre dated August 13, 1985, a part of his conjectures 
that, together with the Shimura-Taniyama-Weil conjecture, implies Fermat’s 
Last Theorem. Mazur proved, in a letter to Mestre dated August 16, 
1985, the following result: suppose moreover that $p > 2$, that $\rho$ is modular 
of some type $(N,2,1)_{\overline{\mathbb{Q}}_p}$, that $l$ is a prime not congruent to 1 mod $p$, that 
$l$ divides $N$ but $l^2$ does not, that $\rho$ is unramified at $l$ if $l \ne p$ and that 
$\rho$ is finite at $p$ if $l = p$; then $\rho$ is modular of type $(N/l,2,1)_{\overline{\mathbb{Q}}_p}$. In 1987, 
Ribet removed the condition “$l \not\equiv 1 \bmod p$” from Mazur’s result, under 
the assumption that $p$ does not divide $N$ (see [33]). These two results 
together imply already that Fermat’s Last Theorem is a consequence of 
the Shimura-Taniyama-Weil conjecture. Together with Mazur [29], Ribet 
extended his result to the case where $p$ divides $N$, but where $p^2$ does not. 

Langlands, Deligne and Carayol have proved [5] that for $f$ a newform the 
conductor of the system of $l$-adic representations associated to $f$ is equal 
to the level of $f$. From this it follows easily that for $f$ an eigenform in 
some $M^0(N,k,\varepsilon)_{\overline{\mathbb{Q}}_p}$ and $\rho_f$ the mod $p$ Galois representation that it gives, 
$N(\rho_f)$ divides $N$. Carayol [4] and Livné [26] classified, independently, in 
terms of the admissible irreducible representation of $\mathrm{GL}_2(\mathbb{Q}_l)$ associated to 
a newform $f$ in some $M^0(N,k,\varepsilon)_{\overline{\mathbb{Q}}_p}$, the cases where the $l$-adic valuations 
of $N(\rho_f)$ and $N$ are different (here $\rho_f$ denotes the mod $p$ representation 
associated to $f$, $\rho_f$ is supposed to be irreducible and $l$ is a prime different 
from $p$). Carayol [4] showed that if $\rho$ is modular of type $(N,k,\varepsilon)_{\overline{\mathbb{Q}}_p}$ and not 
induced from $\mathbb{Q}(\sqrt{-1})$ (resp., $\mathbb{Q}(\sqrt{-3})$) if $p = 2$ (resp., $p = 3$), then it is 
modular of type $(N,k,\varepsilon')_{\overline{\mathbb{Q}}_p}$ for all $\varepsilon'$ whose mod $p$ reduction equals that of 
$\varepsilon$ and such that $\varepsilon'(-1) = (-1)^k$ (this last condition is implied by the first 
if $p \ne 2$). This result shows that in order to prove Proposition 1.11 and 
Theorem 1.12 one need not pay attention anymore to the character, so we 
will drop it from the notation in what follows. 

Suppose now that $\rho$ is modular of some type $(N,k)_{\overline{\mathbb{Q}}_p}$. Then one wants 
to prove that $\rho$ is modular of type $(N(\rho),k)$. In [4] Carayol reduces the 
proof of this, for $p \ge 5$ and $k \ge 2$, to the following two statements: 


(A) There exists a prime number $q$ not dividing $Nl$ and a newform of 
type $(N'q,k,\varepsilon')_{\overline{\mathbb{Q}}_p}$ with $N'$ dividing $N$ and $\varepsilon'$ trivial mod $q$, whose 
associated mod $p$ Galois representation is isomorphic to $\rho$. 


(B) If $l \ne p$ divides $N$, $l^2$ does not divide $N$ and $l$ does not divide $N(\rho)$ 
(i.e., $\rho$ is unramified at $l$), then $\rho$ is modular of type $(N/l,k)_{\overline{\mathbb{Q}}_p}$. 

The first of these two statements is used to switch in certain cases from 
modular curves to Shimura curves associated to indefinite quaternion algebras 
over $\mathbb{Q}$, via the Jacquet-Langlands correspondence. The main part of 
Ribet’s article [32] is about establishing some geometric integral version of 
this correspondence in the case of weight two and trivial character. Statement 
(A) for weight two and trivial character was proved first by Ribet 
in [34] and more generally by Diamond for $2 \le k \le p+1$ and arbitrary 
character in [13]. Note that statement (B) for weight two, trivial character 
and $p^2$ not dividing $N$, is the result of Mazur and Ribet above. A crucial 
point in their proof is that $\rho$ has multiplicity one in the $p$-torsion of the 
jacobian $J_0(N)$, in some sense (see Section 3.3). A semi-simplicity result 
in [2] made it possible for Ribet to prove statement (B) for weight two and 
trivial character, but without the condition that $p^2$ does not divide $N$ (see 
[35] and [32]). In [15] it was shown, using work of Gross [17] and of Coleman 
and Voloch [7], that one can always adapt the weight, in the following 
sense: if $p \ne 2$ and $\rho$ is modular of some type $(N,k,\varepsilon)_{\overline{\mathbb{F}}_p}$ with $N$ prime to 
$p$, then $\rho$ is modular of type $(N,k(\rho),\varepsilon)_{\overline{\mathbb{F}}_p}$. The definition of $k(\rho)$ makes 
it clear that the mechanism behind the proof of this result was known to 
Serre; in Section 4 we will discuss a part of it. This mechanism includes the 
fact that if $\rho$ is modular of some type $(N,k,\varepsilon)_{\overline{\mathbb{F}}_p}$, with $N$ prime to $p$, then 
for some integer $a$, $\rho \otimes \chi^a$ is modular of type $(N,k',\varepsilon)_{\overline{\mathbb{F}}_p}$ with $2 \le k' \le p+1$, 
and $\rho$ is modular of type $(Np^2,2)_{\overline{\mathbb{Q}}_p}$. It follows from this that in order to 
prove statement (B), one may assume that the weight is two. This is used 
in [32] to show statement (B) for $p \ge 3$ and $\rho$ with $\det(\rho) = \chi_p$; Ribet also 
remarks that he expects his proof to extend without difficulty to $\det(\rho)$ 
arbitrary. Finally, Diamond [12] proved statement (B) for $p \ge 3$, following 
[32]. Another proof of statement (B), not using the reduction to weight 
two, but extending the arguments of [33] to weights $k$ between 2 and $p+1$, 
was suggested by Jordan and Livné in [21]. The multiplicity one result 
needed for that is proved in [16]. 


2 The Cases We Need 


Special cases of Theorem 1.12 are used at three different places in the proof
of the Shimura–Taniyama–Weil conjecture and of Fermat’s Last Theorem.
First of all, Ribet’s proof that the Shimura–Taniyama–Weil conjecture
implies Fermat’s Last Theorem is a special case of Theorem 1.12. We briefly
recall the situation. One supposes that Fermat’s Last Theorem is not true.
Then there exist a prime p > 3 and non-zero integers a, b and c that are
pairwise relatively prime and satisfy a^p + b^p + c^p = 0. This leads, via a
construction of Hellegouarch (see [19] and [20]), to a semi-stable elliptic curve
E over Q that is usually called the Frey curve associated to (a^p, b^p, c^p). Let
ρ_p be the representation G_Q → GL_2(F_p) given by the p-torsion of E. It
follows from Mazur’s work on isogenies between elliptic curves over Q (see
[27] and [28]) that ρ_p is irreducible. Moreover, E has the miraculous
property that ρ_p is unramified away from 2p and that its ramification at 2 and p
is very well-behaved: one has N(ρ_p) = 2, k(ρ_p) = 2 and ε(ρ_p) = 1; see [37,
§4]. The conductor N of E is the product of all primes dividing abc; note
that it is square free. If E is modular, i.e., if the Shimura–Taniyama–Weil
conjecture is true for E, then ρ_p is modular of type (N, 2, 1)_{F̄_p}. So in this
case it suffices to have Theorem 1.12 for ρ_p that are modular of some type
(N, 2, 1)_{F̄_p}, with N square free and with p > 3.
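The shape of the Frey curve can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it takes the model y² = x(x − A)(x + B) with A + B + C = 0 (in the text one would have (A, B, C) = (a^p, b^p, c^p)), computes the discriminant 16(ABC)² of that model and the product of the primes dividing ABC. The function names are hypothetical, the toy triple (3, 5, −8) is of course not a Fermat solution, and the code does not check the normalizations on a, b, c under which this product is the conductor.

```python
from math import gcd

def radical(n: int) -> int:
    """Product of the distinct primes dividing n (n != 0)."""
    n = abs(n)
    rad, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            rad *= d
            while n % d == 0:
                n //= d
        d += 1
    return rad * n if n > 1 else rad

def frey_data(A: int, B: int):
    """For the model y^2 = x(x - A)(x + B), return (C, discriminant, radical(ABC)).

    C is defined by A + B + C = 0. The discriminant of this model is
    16 * (A*B*C)^2, because the roots 0, A, -B differ pairwise by A, B, A + B.
    """
    C = -(A + B)
    if A == 0 or B == 0 or C == 0 or gcd(A, B) != 1:
        raise ValueError("need non-zero coprime A, B with A + B != 0")
    disc = 16 * (A * B * C) ** 2
    return C, disc, radical(A * B * C)

# Toy triple, chosen only to exercise the formulas.
print(frey_data(3, 5))
```

The square-freeness of the conductor noted in the text is reflected here in the fact that only the radical of ABC enters, never the exponents.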

Let us now look where Theorem 1.12 intervenes in Wiles’s proof of the 
Shimura—Taniyama—Weil conjecture for semi-stable elliptic curves. One 
has the following proposition, that was given by Oesterlé in the seminar 
held in Paris on the work of Wiles. 


Proposition 2.1 Let E be a semi-stable elliptic curve over Q, p be a prime
number and ρ_p: G_Q → GL_2(F_p) the representation obtained by choosing
some basis of the p-torsion of E. Then either ρ_p is surjective, or it is
reducible and its semi-simplification is isomorphic to the direct sum 1 ⊕ χ_p
of the trivial character with the cyclotomic character.


Proof. If p > 5 or if the image G of ρ_p has order divisible by p, this is part
of Proposition 21 of [39]. Suppose that we are not in this case. We use the
same arguments as Serre does. If ρ_p is reducible, Serre’s argument shows
that its semi-simplification is 1 ⊕ χ_p. So we suppose that ρ_p is irreducible
and we have to derive a contradiction. The morphism det: G → F_p^* is
surjective because det ∘ ρ_p = χ_p. We have p = 2, 3 or 5 and ρ_p is unramified
outside p. The image under ρ_p of inertia at p is a non-split Cartan subgroup
(i.e., cyclic of order p² − 1) or “half” of a split Cartan (i.e., a conjugate
of the subgroup of elements of the form \begin{pmatrix} * & 0 \\ 0 & 1 \end{pmatrix}). The first case
arises if and only if E has good, supersingular reduction at p.

Suppose that p = 2. Then G has order 3 since ρ_p is irreducible. It
follows that Q has a Galois extension of degree 3 which is ramified only at
2; there is no such extension. This finishes the proof for p = 2.

Suppose that p = 3. Then G has order dividing 16. The absolute
irreducibility of ρ_p implies that G has order 8 or 16. But then G has a
quotient of order 4, leading to a Galois extension of Q of degree 4 which is
unramified outside 3; again, such an extension does not exist.

Suppose that p = 5. By §2.6 of [39], G is contained in the normalizer
of a Cartan subgroup or the image Ḡ of G in GL_2(F_p)/F_p^* is isomorphic to
A_4, S_4 or A_5. Since G is of order prime to 5, we cannot have Ḡ isomorphic
to A_5. If Ḡ is isomorphic to A_4 or S_4 we get a Galois extension of degree 3
of Q or of Q(√5) which is unramified outside 5, and such extensions do not
exist. Hence G is contained in the normalizer N of a Cartan subgroup C.
The absolute irreducibility of ρ_p implies that G is not contained in C. This
gives us a degree 2 extension K of Q. We claim that K is unramified, which
gives us the desired contradiction. The extension K is clearly unramified
outside p, so we have to show that ρ_p(I_p) is contained in C. If ρ_p(I_p) is a
non-split Cartan subgroup then this Cartan subgroup has to be C (look at
the action on the projective line), hence we get what we need. If ρ_p(I_p) is
“half” of a split Cartan subgroup, then again ρ_p(I_p) has to be contained
in C because of its action on the projective line. □
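The group orders used in the cases p = 2 and p = 3 of this proof can be checked with a few lines of arithmetic. The sketch below (function names are ours) computes |GL_2(F_p)| = (p² − 1)(p² − p), its prime-to-p part, which bounds the order of an image G of order prime to p, and the order p² − 1 of a non-split Cartan subgroup.

```python
def gl2_order(p: int) -> int:
    """Order of GL_2(F_p): count ordered bases of F_p^2."""
    return (p * p - 1) * (p * p - p)

def prime_to_p_part(n: int, p: int) -> int:
    """Largest divisor of n that is prime to p."""
    while n % p == 0:
        n //= p
    return n

for p in (2, 3, 5):
    n = gl2_order(p)
    # columns: p, |GL_2(F_p)|, prime-to-p part, order of a non-split Cartan
    print(p, n, prime_to_p_part(n, p), p * p - 1)
```

For p = 2 the prime-to-p part is 3 and for p = 3 it is 16, matching the statements "G has order 3" and "G has order dividing 16" once G is assumed to have order prime to p.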


Let E be a semi-stable elliptic curve over Q. If for a prime number p
the representation ρ_p is not surjective, then E admits a Q-rational isogeny
of degree p. Hence, if ρ_3 and ρ_5 are both not surjective, E admits a
Q-rational isogeny of degree 15, hence defines a Q-rational non-cuspidal point
of X_0(15). According to the tables in [1], X_0(15) is an elliptic curve with
exactly eight rational points, four of which are cusps. The four non-cuspidal
points correspond (up to twist) to elliptic curves of conductor 50, and
one can easily see that no twist of them is semi-stable. So we conclude that
at least one among ρ_3 and ρ_5 is surjective.

Suppose that ρ_3 is surjective. Results of Langlands [25] and Tunnell
[46] show that ρ_3 is modular of type (3^m N(ρ_3)², 1)_{F̄_3}, for some m > 0.
This will be discussed in Section 4. In that same section, we will show that
ρ_3 is then modular of type (N(ρ_3)², 2, 1)_{F̄_3} if ρ_3 is finite at 3, and of type
(3N(ρ_3)², 2, 1)_{F̄_3} if ρ_3 is not finite at 3. We will not need the results on
companion forms, and we can use an older version of Carayol’s Lemma dating
back to Mazur’s [28]. Carayol’s reductions in [4] imply that ρ_3 is modular
of type (N(ρ_3), 2, 1)_{F̄_3} if ρ_3 is finite at 3, and of type (3N(ρ_3), 2, 1)_{F̄_3} if
ρ_3 is not finite at 3. Starting at this point, Wiles and Taylor prove (see
[47] and [45]) that all deformations of ρ_3 that are ramified at only finitely
many primes, that are “semi-stable” at 3 and whose determinant is the
cyclotomic character are modular. Hence E is modular. (Let us note that
the restriction of ρ_3 to Gal(Q̄/Q(√−3)) is absolutely irreducible since its
image is SL_2(F_3).) So in this case we don’t need Ribet’s part of the proof
of Theorem 1.12.

Suppose that ρ_3 is not surjective. Then ρ_5 is surjective. In this case,
Wiles [47] shows that there exists an elliptic curve E′ over Q such that its
representation ρ′_3 is irreducible and with ρ′_5 isomorphic to ρ_5 (this follows
from the fact that X(5) is a disjoint union of four projective lines). The
curve E′ is semi-stable since this can be read off from ρ_5, even at the
prime 5 itself. The previous argument shows that E′ is modular, hence
ρ_5 is modular of type (N′, 2, 1)_{F̄_5}, where N′ is the conductor of E′. Note
that N′ is square free. Ribet’s part of the proof of Theorem 1.12 shows
that ρ_5 is modular of type (N(ρ_5), 2, 1)_{F̄_5} if ρ_5 is finite at 5, and of type
(5N(ρ_5), 2, 1)_{F̄_5} if ρ_5 is not finite at 5. Wiles and Taylor ([47] and [45])
show that all deformations of ρ_5 that are ramified at only finitely many
primes, that are “semi-stable” at 5 and whose determinant is the cyclotomic
character are modular. Hence E is modular.

The final conclusion of this section is the following. We need Ribet’s
part of the proof of Theorem 1.12 in the case where p > 3 and ρ is modular
of some type (N, 2, 1)_{F̄_p}, with N square free. We also need Carayol’s
reductions for p = 3 and ρ modular of type (N², 2, 1)_{F̄_3} with N square free.
It should be noted that Carayol’s reductions use Mazur’s result, so in the
next section we will explain that result for p ≥ 3.


3 Weight Two, Trivial Character and 
Square Free Level 


The aim of this section is to describe a proof of the following theorem. 


Theorem 3.1 Let p > 3 be prime. Let ρ: G_Q → GL_2(F̄_p) be irreducible
and modular of some type (N, 2, 1)_{F̄_p} with N square free. If ρ is finite at
p, it is modular of type (N(ρ), 2, 1)_{F̄_p}. If ρ is not finite at p, it is modular
of type (pN(ρ), 2, 1)_{F̄_p}.

The proof of this theorem has two parts. The first part, due to Mazur, deals
with the primes l dividing N/N(ρ) that are not congruent to 1 modulo p.
The second part, due to Ribet, deals with an arbitrary l ≠ p at the cost
of introducing a prime q in the level that one can get rid of by the first
part. These two results will be described in the following two sections. For
a detailed proof of in fact more general results we refer to [32, §6–§8]; our
aim is just to give a good idea of what happens in [32].


3.2 Mazur’s Result


Theorem 3.2.1 Let p ≥ 3 be prime. Let ρ: G_Q → GL_2(F̄_p) be irreducible
and modular of some type (N, 2, 1)_{F̄_p}. Suppose that l is a prime not
congruent to 1 mod p, that l divides N but l² does not, that ρ is unramified
at l if l ≠ p and that ρ is finite at p if l = p. Then ρ is modular of type
(N/l, 2, 1)_{F̄_p}.


The proof of this result is by contradiction: let M := N/l and suppose
that ρ is not modular of type (M, 2, 1)_{F̄_p}. Let X_0(N) denote the model
over the localization Z_(l) of Z at l which is constructed in [10] as a coarse
moduli space (see also [24]). Let J_0(N)_Q be the jacobian of X_0(N)_Q, and
let J_0(N) be its Néron model over Z_(l). Then J_0(N) represents in fact
the degree zero part of the relative Picard functor of X_0(N) over Z_(l) by
[30]; this implies that one can describe J_0(N) in terms of X_0(N). By [10,
V–VI], the fibre X_0(N)_{F_l} of X_0(N) over F_l is the union of two copies of the
smooth curve X_0(M)_{F_l}, which intersect transversally at the supersingular
points. This gives the following “dévissage” of J_0(N)_{F_l}. One has the
connected component of the identity element J_0(N)^0_{F_l} of J_0(N)_{F_l}. The quotient
J_0(N)_{F_l}/J_0(N)^0_{F_l} is a finite etale group scheme over F_l, which is in fact a
constant group scheme that we will denote by Φ_0(N)_{F_l}. By construction
we have an exact sequence:


(3.2.2)   0 → J_0(N)^0_{F_l} → J_0(N)_{F_l} → Φ_0(N)_{F_l} → 0.


The normalization map X_0(M)_{F_l} ⊔ X_0(M)_{F_l} → X_0(N)_{F_l} induces another
short exact sequence:


(3.2.3)   0 → T_0(N)_{F_l} → J_0(N)^0_{F_l} → J_0(M)^2_{F_l} → 0,


where T_0(N)_{F_l} is a torus whose character group Hom_{F̄_l}(T_0(N)_{F̄_l}, G_{m,F̄_l})
can be identified with the group of degree zero divisors on X_0(M)_{F̄_l} with
support in the supersingular points. Let 𝕋_0(N) be the Hecke algebra of
level N: it is the subring of End(J_0(N)_Q) generated by the Hecke operators
T_n, n ≥ 1. By the Néron property of J_0(N), 𝕋_0(N) acts on it. This action
can be described explicitly by correspondences over Z_(l). The short exact
sequences (3.2.2) and (3.2.3) are compatible with the action of 𝕋_0(N).
Let us now look at what all this has to do with our representation ρ. The
fact that ρ is modular of type (N, 2, 1)_{F̄_p} means that there exist a maximal
ideal m of 𝕋_0(N), an embedding of k := 𝕋_0(N)/m into F̄_p, an integer d ≥ 1
and a 2-dimensional k-vector space V with an action by G_Q with F̄_p ⊗_k V
giving ρ, such that J_0(N)(Q̄)[m] is isomorphic to the direct sum V^d of d
copies of V as k[G_Q]-modules. (Here J_0(N)(Q̄)[m] denotes the kernel of
m, i.e., the elements x such that tx = 0 for all t in m.) The fact that
J_0(N)(Q̄)[m] is semi-simple is proved in [2].

We will now first treat the case where l ≠ p, which is technically simpler
than the case l = p. So we suppose that l ≠ p. Then ρ is unramified at l
by hypothesis, and there is a unique finite etale k-vector space scheme W
over Z_(l) such that V = W(Q̄) as k[G_Q]-modules. We choose an injection
of V into J_0(N)(Q̄)[m]. This gives us an injection of W_Q into J_0(N)_Q. The
Néron property of J_0(N) implies that this injection extends to a morphism
W → J_0(N), which must be injective since J_0(N)[p] is etale. Consider the
image of W_{F_l} in Φ_0(N)_{F_l} under (3.2.2). It was proved in [33] that the action
of 𝕋_0(N) on Φ_0(N)_{F_l} is “Eisenstein,” in the sense that for q prime to N the
operator T_q acts as multiplication by q + 1. Since ρ is irreducible, it follows
that the image of W_{F_l} in Φ_0(N)_{F_l} is zero. Hence W_{F_l} lands in J_0(N)^0_{F_l}.
Since we suppose that ρ is not modular of type (M, 2, 1)_{F̄_p}, the image of
W_{F_l} in J_0(M)^2_{F_l} is zero. So W_{F_l} lands in the torus T_0(N)_{F_l}. This has strong
consequences for the Frobenius element ρ(Frob_l). Namely, Frob_l acts on
T_0(N)_{F_l}(F̄_l) simultaneously as lT_l and as −lw_l, with w_l the Atkin–Lehner
involution of level l. This implies that ρ(Frob_l) is in k^* and it follows that
l = det(ρ(Frob_l)) = ρ(Frob_l)² = l². This contradicts the assumption that l
is not congruent to 1 modulo p.
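The final congruence can be spelled out as follows. This is a sketch that assumes det ρ = χ_p (as holds for a representation of type (N, 2, 1)) and uses only that the Atkin–Lehner involution satisfies w_l² = 1; the symbol c for the scalar ρ(Frob_l) is our notation.

```latex
\begin{align*}
\rho(\mathrm{Frob}_l) &= c \in k^{*}, \qquad c^{2} = (-l\,w_l)^{2} = l^{2}\,w_l^{2} = l^{2},\\
l = \chi_p(\mathrm{Frob}_l) &= \det\rho(\mathrm{Frob}_l) = c^{2} = l^{2} \quad \text{in } k,\\
\text{hence}\quad l\,(l-1) &= 0 \ \text{in } k, \quad \text{i.e. } l \equiv 1 \pmod{p} \ \text{because } l \neq p .
\end{align*}
```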

Let us now assume that l = p. We have the information that ρ is finite
at p. Let W_Q be the finite k-vector space scheme over Q such that W_Q(Q̄)
gives V. Then W_Q can be extended to a finite flat k-vector space scheme W
over Z_(p). Such an extension is unique by results of Raynaud in [31] (here
we use that p ≠ 2). We choose an injection of V into J_0(N)(Q̄)[m]. This
gives an injection of W_Q into J_0(N)_Q. Ribet showed in [33, Lemma 6.2]
that this injection extends to an injection of W into J_0(N). Note that this
is not just a consequence of the Néron property of J_0(N), since W is not
smooth as det(ρ) = χ_p. Once one has this result the rest of the proof of
Theorem 3.2.1 proceeds as in the case l ≠ p. So it remains to explain the
proof of [33, Lemma 6.2]. We may replace Z_(p) by its completion Z_p. Then
W sits in a short exact sequence:


(3.2.4)   0 → W^0 → W → W^{et} → 0,


where W^0 denotes the connected component of the zero section of W, and
where W^{et} is the largest finite etale quotient of W. Grothendieck proved,
see [18, 7, IX, §11.6], that the action of G_{Q_p} on J_0(N)(Q̄_p)/J_0(N)^0(Z̄_p) is
unramified. This implies that the injection of W_{Q_p} into J_0(N)_{Q_p} extends
to a morphism of W^0 into J_0(N)^0, which is a closed immersion by [31].
Then one can finish by applying the Néron property to the amalgamated
sum (W ⊕ J_0(N)^0)/W^0 (this trick is due to Grothendieck, see [18, 7, IX,
5.9.2]).


3.3 Ribet’s Result 


Theorem 3.3.1 Let p > 3 be prime. Let ρ: G_Q → GL_2(F̄_p) be irreducible
and modular of some type (N, 2, 1)_{F̄_p}. Suppose that l ≠ p is a prime, that l
divides N but l² does not, and that ρ is unramified at l. Then there exists
a prime number q, not dividing N and congruent to −1 mod p, such that
ρ is modular of type (qN/l, 2, 1)_{F̄_p}.


We define M := N/l. Let q be a prime number not dividing Np with the
property that ρ(Frob_q) = ρ(c), with c a complex conjugation. This implies
that q is congruent to −1 modulo p. In what follows we will consider the
modular curves X_0(Mlq)_Q, X_0(Ml)_Q, X_0(Mq)_Q, and a certain Shimura
curve C_Q that we will now define (see [33, §4] for more details). Let B_Q
be a quaternion algebra over Q of discriminant lq and let B be a maximal
order in it. Then C_Q is the Shimura curve associated to B_Q of level
Γ_0(M). More precisely, this means that C_Q is the coarse moduli scheme for
objects (A/S, α, G), with S a Q-scheme, A/S an abelian scheme of relative
dimension two, with α: B → End_S(A) a morphism of rings and G ⊂ A a
finite flat closed subgroup scheme of rank M², which is killed by M and
which is stable for the action by B. We define J_Q to be the jacobian of C_Q.
From the moduli interpretation of C_Q it is clear that one can define Hecke
correspondences on C_Q, inducing endomorphisms T_n, n ≥ 1, of J_Q. Let 𝕋
be the subring of End_Q(J_Q) generated by these T_n.
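The congruence on q stated at the start of this construction can be verified in one line. The sketch assumes, as for any representation of type (N, 2, 1), that det ρ is the mod p cyclotomic character χ_p, so that χ_p(Frob_q) is the class of q and χ_p(c) = −1.

```latex
\[
q \bmod p \;=\; \chi_p(\mathrm{Frob}_q) \;=\; \det\rho(\mathrm{Frob}_q)
\;=\; \det\rho(c) \;=\; \chi_p(c) \;=\; -1 ,
\qquad\text{so}\qquad q \equiv -1 \pmod{p}.
\]
```

Since p is odd, the same hypothesis shows that ρ(Frob_q) has the two distinct eigenvalues 1 and −1, a fact used repeatedly later in this section.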

We have the usual pairs of degeneracy morphisms from X_0(Mlq)_Q to
X_0(Ml)_Q and to X_0(Mq)_Q, inducing a morphism

    J_0(Mlq)_Q → J_0(Ml)_Q² × J_0(Mq)_Q².


The lq-new subvariety J_0(Mlq)_Q^{lq-new} of J_0(Mlq)_Q is defined as the
connected component of the identity element of the kernel of this last
morphism. One knows that J_Q is isogenous to J_0(Mlq)_Q^{lq-new} (this results
from trace formula calculations by Eichler, Shimizu, Jacquet–Langlands
and Faltings’s isogeny theorem). Ribet has given a more precise version of
this in terms of the character groups of the torus parts of the reductions
mod l and q of the jacobians of the curves under consideration. For G a
commutative algebraic group over a field k, let X(G) := Hom_{k̄}(T_{k̄}, G_{m,k̄})
be the character group scheme of the maximal torus T_{k̄} of G_{k̄}. Then Ribet
constructed a short exact sequence:


(3.3.2)   0 → X(J_{F_q}) → X(J_0(Mlq)_{F_l}) → X(J_0(Ml)_{F_l})² → 0

which is Hecke-equivariant in the sense that for each n ≥ 1, the element T_n
in 𝕋_0(Mlq) induces the element T_n of 𝕋 on X(J_{F_q}). The induced action of
T_n on X(J_0(Ml)_{F_l})² can be described in terms of a two by two matrix with
coefficients in 𝕋_0(Ml). Since l and q play symmetric roles, we also have
the following exact sequence:


(3.3.3)   0 → X(J_{F_l}) → X(J_0(Mlq)_{F_q}) → X(J_0(Mq)_{F_q})² → 0.


To construct these sequences, Ribet relies heavily on work of Cerednik,
Drinfeld and Jordan–Livné concerning the q-adic uniformization of C_{Q_q}. A
detailed account of this uniformization can be found in [3]. An amazing
feature of these sequences is that they compare character groups of tori over
fields of distinct characteristics. Since J^0_{F_q} is its own maximal torus, it follows
from (3.3.2) that 𝕋_0(Mlq) acts on J_Q via a (necessarily unique) morphism
of rings 𝕋_0(Mlq) → 𝕋 that sends T_n to T_n.

Let η_q be the element T_q² − 1 of 𝕋_0(Mlq), and let Φ_{F_q} be the group
of connected components of J_{F_q}. For M a finite abelian group, let M^* :=
Hom_Z(M, Q/Z) be its Pontrjagin dual. Theorem 4.3 of [33] asserts that
there is a Hecke equivariant exact sequence:


(3.3.4)   0 → K_q → X(J_0(Ml)_{F_l})²/η_q X(J_0(Ml)_{F_l})² → Φ^*_{F_q} → C_q → 0,


with K_q and C_q “Eisenstein” in the sense we saw in the previous section.
Likewise, one has an exact sequence:


(3.3.5)   0 → K_l → X(J_0(Mq)_{F_q})²/η_l X(J_0(Mq)_{F_q})² → Φ^*_{F_l} → C_l → 0


with K_l and C_l “Eisenstein”. Since ρ is modular of type (Ml, 2, 1)_{F̄_p},
ρ arises from a maximal ideal of 𝕋_0(Ml), in the way we have seen in
Section 3.2. It is not hard to see that then ρ also arises from a maximal ideal
m of 𝕋_0(Mlq) (note that we do not claim that ρ arises from a newform
whose level is divisible by q). More precisely, we have a maximal ideal m
of 𝕋_0(Mlq), an embedding of k := 𝕋_0(Mlq)/m into F̄_p, a two-dimensional
k-vector space V with an action by G_Q with F̄_p ⊗_k V giving ρ, such that
J_0(Mlq)(Q̄)[m] is isomorphic to V^λ for some positive integer λ (this λ is
called the multiplicity at m of ρ in J_0(Mlq)). Let μ be the multiplicity at
m of ρ in J_Q: J_Q(Q̄)[m] ≅ V^μ (it follows again from [2] that J_Q(Q̄)[m] is
semi-simple). It is clear that μ ≥ 0.

From now on we suppose that ρ is not modular of type (Mq, 2, 1)_{F̄_p}.
From this assumption and the exact sequences above Ribet then derives
that 2μ ≤ λ and that 2λ ≤ 2μ, which gives a contradiction because we
know that λ > 0. So let us describe the arguments of Ribet. One starts
by localizing the exact sequence (3.3.5) at m; this shows that Φ_{F_l,m} is
zero, because m is not in the support of X(J_0(Mq)_{F_q}). This implies that
J_{F_l}(F̄_l)[m] is isomorphic to V^μ (as k-vector spaces), so that:

(3.3.6)   dim_k(X(J_{F_l}) ⊗_{𝕋_0(Mlq)} k) = 2μ.

The exact sequence (3.3.3) shows that:

(3.3.7)   dim_k(X(J_0(Mlq)_{F_q}) ⊗_{𝕋_0(Mlq)} k) ≥ dim_k(X(J_{F_l}) ⊗_{𝕋_0(Mlq)} k).


Next we have the following exact sequence, obtained by replacing N by
Mlq in (3.2.2):

(3.3.8)   0 → J_0(Mlq)^0_{F_l} → J_0(Mlq)_{F_l} → Φ_0(Mlq)_{F_l} → 0.

Since Φ_0(Mlq)_{F_l} is “Eisenstein”, it follows that J_0(Mlq)(Q̄)[m] specializes
into J_0(Mlq)^0_{F_l}(F̄_l). As in (3.2.3), the normalization of X_0(Mlq)_{F_l} induces
a short exact sequence:

(3.3.9)   0 → T_0(Mlq)_{F_l} → J_0(Mlq)^0_{F_l} → J_0(Mq)^2_{F_l} → 0.

Since ρ is not modular of type (Mq, 2, 1)_{F̄_p}, it follows that:

(3.3.10)   dim_k(X(J_0(Mlq)_{F_l}) ⊗_{𝕋_0(Mlq)} k) = 2λ.


Ribet shows that the Frobenius endomorphism of J^0_{F_q} is equal to qT_q. It
follows from this that ρ(Frob_q) acts as a scalar (i.e., an element of k)
on J_{F_q}(F̄_q)[m]. But by the choice of q, ρ(Frob_q) is in the conjugacy class
of \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. It follows that:


(3.3.11)   dim_k(X(J_{F_q}) ⊗_{𝕋_0(Mlq)} k) ≤ μ.


The same argument applied to the maximal torus T_0(Mlq)_{F_q} in J_0(Mlq)_{F_q}
gives:


(3.3.12)   dim_k(X(J_0(Mlq)_{F_q}) ⊗_{𝕋_0(Mlq)} k) ≤ λ.

Lemma 3.3.13 We have dim_k(X(J_0(Ml)_{F_l})² ⊗_{𝕋_0(Mlq)} k) ≤ μ.

Proof. First note that η_q is in m. The exact sequence (3.3.4) shows that:

    dim_k(X(J_0(Ml)_{F_l})² ⊗_{𝕋_0(Mlq)} k) = dim_k(Φ^*_{F_q} ⊗_{𝕋_0(Mlq)} k) = dim_k(Φ_{F_q}[m]).

Grothendieck’s description [18, 7, IX, §11] of Φ_{F_q} gives an exact sequence:


(3.3.14)   0 → X(Ĵ_{F_q}) → X(J_{F_q})^∨ → Φ_{F_q} → 0,

where for M a Z-module M^∨ denotes its Z-dual, and where Ĵ_{F_q} is just
J_{F_q}, but with the dual 𝕋_0(Mlq)-action: t in 𝕋_0(Mlq) acts as t^*, the
dual of the endomorphism given by t (this uses the natural autoduality
of jacobians). This makes the sequence (3.3.14) Hecke equivariant.
Consider multiplication by p on the exact sequence (3.3.14). Applying the
snake Lemma and then taking kernels for m gives an injection of Φ_{F_q}[m]
into (X(Ĵ_{F_q}) ⊗_Z F_p)[m]. Ribet shows, using more results from [18, 7, IX,
§11] and a description of the action of Frob_q on Hom(X(J_{F_q}), μ_p) and on
X(Ĵ_{F_q}), that one has an exact sequence:


(3.3.15)   0 → Hom(X(J_{F_q}), μ_p)[m] → J(Q̄)[m] → (X(Ĵ_{F_q}) ⊗ F_p)[m] → 0.


The middle term in this sequence has dimension 2μ by the definition of μ.
The fact that ρ(Frob_q) has the two distinct eigenvalues 1 and −1 then
implies that the other two terms both have dimension μ. □


We can now finish our description of the proof of Theorem 3.3.1. The
sequence (3.3.2) gives:


(3.3.16)   dim_k(X(J_0(Mlq)_{F_l}) ⊗_{𝕋_0(Mlq)} k) ≤ dim_k(X(J_{F_q}) ⊗_{𝕋_0(Mlq)} k)
                + dim_k(X(J_0(Ml)_{F_l})² ⊗_{𝕋_0(Mlq)} k).


Combining (3.3.6), (3.3.7) and (3.3.12) gives 2μ ≤ λ. On the other hand,
combining (3.3.10), (3.3.16), (3.3.11) and Lemma 3.3.13 gives 2λ ≤ 2μ.
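For the reader's convenience, the two chains of inequalities and the resulting contradiction can be written out as follows. This is a sketch: all tensor products are over the Hecke algebra of level Mlq, and the placement of the subscripts F_l and F_q follows our reading of (3.3.6) through (3.3.16).

```latex
\begin{align*}
2\mu &= \dim_k\bigl(X(J_{\mathbf{F}_l})\otimes k\bigr)
      \le \dim_k\bigl(X(J_0(Mlq)_{\mathbf{F}_q})\otimes k\bigr)
      \le \lambda
      && \text{by (3.3.6), (3.3.7), (3.3.12)},\\
2\lambda &= \dim_k\bigl(X(J_0(Mlq)_{\mathbf{F}_l})\otimes k\bigr)
      \le \dim_k\bigl(X(J_{\mathbf{F}_q})\otimes k\bigr)
        + \dim_k\bigl(X(J_0(Ml)_{\mathbf{F}_l})^{2}\otimes k\bigr)
      \le \mu + \mu
      && \text{by (3.3.10), (3.3.16), (3.3.11), Lemma 3.3.13}.
\end{align*}
```

Together these give 2λ ≤ 2μ ≤ λ, hence λ ≤ 0, which is impossible because λ is a positive integer.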


4 Dealing with the Langlands—Tunnell Form 


Let E be a semi-stable elliptic curve over Q, with ρ_3 irreducible. It follows
from Proposition 2.1 that ρ_3: G_Q → GL_2(F_3) is surjective. The aim of this
section is to show that ρ_3 is modular of type (N(ρ_3)², 2, 1)_{F̄_3} if ρ_3 is finite
at 3 and of type (3N(ρ_3)², 2, 1)_{F̄_3} if ρ_3 is not finite at 3.


4.1 The Theorem of Langlands and Tunnell


The first step is to show that ρ_3 is modular, by lifting ρ_3 to a continuous
representation of G_Q with values in GL_2(C), i.e., a continuous
representation ρ: G_Q → GL_2(C) (for the discrete topology on C) such that ρ_3 is a
reduction modulo 3 of ρ, and applying a theorem of Langlands and Tunnell
to ρ. To do this, we compose ρ_3 with a two-dimensional complex
representation of GL_2(F_3). The irreducible representations of the groups GL_2(k),
where k is a finite field, are well-known, see for example [6]. If q denotes
the cardinality of k, then the irreducible representations have dimensions
1, q − 1, q or q + 1. Those of dimension one factor via the determinant.
Hence for q > 3 there are no faithful two-dimensional representations.
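The dimension count can be sanity-checked by brute force for small q. The sketch below enumerates GL_2(F_q) for a prime q, counts its conjugacy classes (which equals the number of irreducible complex representations), and compares with the list of dimensions 1, q + 1, q, q − 1 taken with their standard multiplicities. The multiplicities are an input we supply from the general theory, not something the code derives, and all function names are ours.

```python
from itertools import product

def gl2(q):
    """All invertible 2x2 matrices over Z/q (q prime), as tuples (a, b, c, d)."""
    return [m for m in product(range(q), repeat=4)
            if (m[0] * m[3] - m[1] * m[2]) % q != 0]

def mul(x, y, q):
    a, b, c, d = x
    e, f, g, h = y
    return ((a*e + b*g) % q, (a*f + b*h) % q, (c*e + d*g) % q, (c*f + d*h) % q)

def inv(x, q):
    a, b, c, d = x
    det_inv = pow((a*d - b*c) % q, -1, q)  # modular inverse, Python 3.8+
    return ((d*det_inv) % q, (-b*det_inv) % q, (-c*det_inv) % q, (a*det_inv) % q)

def class_count(q):
    """Number of conjugacy classes of GL_2(F_q), by exhaustive conjugation."""
    group = gl2(q)
    seen, classes = set(), 0
    for x in group:
        if x not in seen:
            classes += 1
            seen.update(mul(mul(g, x, q), inv(g, q), q) for g in group)
    return classes

def dimension_multiset(q):
    """Dimensions of the irreducibles, with the standard multiplicities (assumed)."""
    return ([1] * (q - 1)                          # characters through det
            + [q + 1] * ((q - 1) * (q - 2) // 2)   # principal series
            + [q] * (q - 1)                        # twists of Steinberg
            + [q - 1] * (q * (q - 1) // 2))        # cuspidal

q = 3
dims = dimension_multiset(q)
print(len(gl2(q)), class_count(q), sorted(dims), sum(d * d for d in dims))
```

For q = 3 the list contains exactly three entries equal to 2, in agreement with the three irreducible two-dimensional representations discussed next.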

In the case of interest for us, i.e., q = 3, there are three irreducible
two-dimensional representations. One of them can be realized in GL_2(Z),
but it is not faithful (it factors through the quotient S_3). The other two are
Galois conjugates and can be realized in GL_2(Z[√−2]). Let π: GL_2(F_3) →
GL_2(Z[√−2]) be one of these, and let ρ := π ∘ ρ_3: G_Q → GL_2(Z[√−2]).
From the character table given in [6], one sees immediately that π is faithful
(our π is one of the π(Λ) with Λ of order eight), and that the restriction of
π to the subgroup \begin{pmatrix} * & 0 \\ 0 & 1 \end{pmatrix} is the regular representation. It follows that ρ is
odd and that det ∘ ρ is the composition of χ_3: G_Q → F_3^* with the embedding
of F_3^* into C^*. Looking again at the character table shows that reduction of
ρ modulo one of the two maximal ideals containing 3 gives ρ_3. According
to [37, §5.3] the conductor N(ρ) of ρ is 3^m N(ρ_3)², for some m > 0.

This is explained as follows. Let l ≠ 3 be a prime. If ρ_3 is unramified
at l then ρ is unramified too. Suppose that ρ_3 is ramified at l. Since E is
semi-stable, ρ_3(I_l) is a conjugate of the subgroup \begin{pmatrix} 1 & * \\ 0 & 1 \end{pmatrix}. From the character
table one sees that π\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} has trace −1, hence its eigenvalues are the two
roots of unity of order 3. This means that the dimension of the space of
I_l-invariants for ρ_3 is one, and that it is zero for ρ. Hence the exponent of
l in N(ρ) is twice that of l in N(ρ_3). It will be of use for us to determine
the exponent m of 3 in N(ρ) in the case where ρ_3|_{I_3} is isomorphic to the
direct sum χ_3 ⊕ 1. In this case, in which the ramification is tame, (1.6)
shows that m = 1. One can see easily that in all other cases m > 1.
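The factor of two at primes l ≠ 3 can be made explicit. The following is a sketch under the assumption that the exponent of l in the conductor of a tamely ramified two-dimensional representation V is dim V minus the dimension of the inertia invariants; the ramification here is tame because the image of I_l has order 3 and l ≠ 3. We write u for the generator \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} and ζ for a primitive cube root of unity (our notation).

```latex
\begin{align*}
&\pi(u)^3 = 1,\quad \operatorname{tr}\pi(u) = -1
  \;\Longrightarrow\; \text{eigenvalues of } \pi(u) \text{ are } \zeta,\ \zeta^{2}
  \quad (\zeta + \zeta^{2} = -1),\\
&\text{exponent of } l \text{ in } N(\rho_3) = 2 - \dim V_{\rho_3}^{I_l} = 2 - 1 = 1,\\
&\text{exponent of } l \text{ in } N(\rho) = 2 - \dim V_{\rho}^{I_l} = 2 - 0 = 2 .
\end{align*}
```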

A deep result of Langlands and Tunnell ([25] and [46]) says that ρ is
modular of type (N(ρ), 1, det(ρ))_{Z[√−2]}, i.e., there exists a cuspidal
eigenform f of level N(ρ), of weight 1 and with character det(ρ) such that
ρ ≅ ρ_f, where ρ_f: G_Q → GL_2(C) is the representation associated to f by
Deligne–Serre in [11] (see Chapter VI, Theorem 1.3). At this point we
know that ρ_3 arises from the modular form f.


4.2 Some Purely Mod 3 Arguments 


The second step is to show that ρ_3 is modular of type (N(ρ_3)², k(ρ_3), 1)_{F̄_3}.
First we want to see that ρ_3 is modular of type (N(ρ_3)², k)_{F̄_3}, for some
k > 1. This is folklore. Ribet has given a proof of this in [32, §2] for
p > 2. We remark that there are (unpublished) conceptual short proofs,
valid for all p, using either the results of Katz and Mazur in [24] on the
reduction mod p of modular curves (the idea is just to restrict the form to
a suitable irreducible component of this reduction), or the representation
theory of GL_2(F_p) on F_p-vector spaces and some group cohomology or etale
cohomology.

We can now assume that ρ_3 is modular of type (N(ρ_3)², k)_{F̄_3}, for some
k > 1. We want to show that it is modular of type (N(ρ_3)², k(ρ_3))_{F̄_3}, i.e., we
want to adjust the weight. Theorem 4.5 of [15] says that this adjustment
of the weight can always be done; in [15] one can find a detailed proof.
We will now describe only the main features of that proof, and indicate a
simplification, due to Taylor, in the case of our ρ_3.

The proof of Theorem 4.5 of [15] runs as follows. Let p be a prime
number and ρ: G_Q → GL_2(F̄_p) be irreducible, continuous and modular of
some level N which is prime to p. Then one knows that there exist a in
Z/(p−1)Z and k in {2, ..., p+1} such that


    ρ ≅ ρ_f ⊗ χ_p^a


for some eigenform f of type (N, k)_{F̄_p}. Since f has low weight, i.e., k ≤ p+1,
one knows that ρ_f|_{I_p} is an extension of the trivial character by χ_p^{k−1} if f is
ordinary (i.e., T_p f ≠ 0) and that it is the direct sum of ψ^{k−1} and ψ′^{k−1} if
f is supersingular (i.e., T_p f = 0); here ψ and ψ′ are the two fundamental
characters of level two that occur in Definition 1.7. The first part of this
result is due to Deligne (see [17] for a proof). The second part was first
proved by Fontaine (see [15] for a proof). It follows that a and k are
completely determined by ρ|_{I_p}, except if ρ|_{I_p} is a direct sum of two distinct
powers of χ_p or if it is a non-split extension of χ_p^a by χ_p^{a+1}.

In the first case there are two candidates for a; in the second case one
cannot decide between the values 2 and p+1 for k. In the second case Serre
conjectured, and Mazur showed under the assumption that p > 2, that one
can take k = 2 if and only if ρ ⊗ χ_p^{−a} is finite at p. Recall that Mazur’s
argument was described in §3.2. In the first case, Serre conjectured that
both candidates for a should work and called the two forms corresponding
to the two values of a companions of each other. The fact that such
companion forms exist was proved in [17], [7] and [16], except for some cases
when p = 2. With these results, it remains to construct a form f′ of type
(N, k′)_{F̄_p}, with k′ as small as possible, such that ρ ≅ ρ_{f′}.

This last equality is equivalent to: a′_l = l^a a_l for all primes l not dividing
Np, where the a′_l and a_l are the eigenvalues of f′ and f for T_l. There is an
operation on modular forms over F_p that has the effect of the derivation
q d/dq on q-expansions. This derivation θ is described for forms of level
one by Swinnerton-Dyer and Serre in [44], and in general, in terms of the
Gauss–Manin connection, by Katz in [23]. The properties proved about
θ in [23] imply that one knows exactly how to construct f′: one should
take f′ := A^{−n} θ^a(f) with n as large as possible (here A denotes the Hasse
invariant), or the same expression with f replaced by its companion if it
exists. The results in [23] imply that one knows the value of n.
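The effect of θ on q-expansions is easy to experiment with. The sketch below models only that effect, namely multiplying the coefficient of q^n by n modulo p; it does not model the weight, the Hasse invariant A, or the division by A^n. The function names are ours, and the sample coefficients are the first Fourier coefficients of the discriminant form Δ, used purely as test data.

```python
def theta(coeffs, p):
    """Apply q d/dq to sum a_n q^n, given as [a_0, a_1, ...], reducing mod p."""
    return [(n * a) % p for n, a in enumerate(coeffs)]

def theta_power(coeffs, p, a):
    """Apply theta a times; the coefficient of q^n gets multiplied by n^a mod p."""
    for _ in range(a):
        coeffs = theta(coeffs, p)
    return coeffs

# Delta = q - 24 q^2 + 252 q^3 - 1472 q^4 + 4830 q^5 - ...
delta = [0, 1, -24, 252, -1472, 4830]
p = 5
print(theta(delta, p))
print(theta_power(delta, p, 2))
```

Two features are visible in this toy model: the eigenvalue relation a′_l = l^a a_l from the text, and the fact that the exponent a only matters modulo p − 1 on coefficients prime to p, since n^p ≡ n mod p.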
Let us now look at our ρ_3. We want to show that we do not need the
results on companion forms to show that ρ_3 is modular of type (N(ρ_3)², k(ρ_3))_{F̄_3}
(this is an observation of Taylor). We know that ρ_3 or ρ_3 ⊗ χ_3 arises from
a form of weight k with 2 ≤ k ≤ 4. If E has bad or good and ordinary
reduction at 3, then ρ_3|_{I_3} is an extension of the trivial character 1 by χ_3;
k(ρ_3) = 2 if ρ_3 is finite at 3 (this is the case if and only if the exponent of
3 in the discriminant of E is a multiple of 3), otherwise k(ρ_3) = 4. If E
has good and supersingular reduction at 3, then ρ_3|_{I_3} is the direct sum of
ψ and ψ′, the two fundamental characters of level two.

Suppose first that E is good and supersingular at 3. Then ρ_3 ⊗ χ_3
cannot arise from a form of weight between 2 and 4, because its restriction
to I_3 is not of the right form. Hence ρ_3 itself arises from an eigenform f
of weight k with 2 ≤ k ≤ 4. Because det(ρ_3) = χ_3, we must have k = 2
or k = 4. It remains to see that we can take k = 2. So suppose that f
has weight 4. We claim that then f = Ag, where A is the Hasse invariant
and g an eigenform of weight two. To prove this, it suffices to show that
f vanishes at the supersingular points of the modular curve on which it
lives. Now if f did not vanish at all the supersingular points, then the
regular differential form ω(f) on X_0(3N(ρ_3)²)_{F̄_3} that will be constructed
in the next section would have a non-zero residue at some supersingular point. A
study of the action of T_3 then shows that T_3 f = ±f, but we have T_3 f = 0.
The eigenform g has the same eigenvalues as f, hence ρ_3 ≅ ρ_g.

Suppose now that E has bad or good and ordinary reduction. Then
ρ_3|_{I_3} is an extension of 1 by χ_3. Let us first deal with the case where
this extension is not split. Then ρ_3 ⊗ χ_3 cannot arise from a form of
weight between 2 and 4, hence there is a form f of weight 2 or 4 such that
ρ_3 ≅ ρ_f. If k(ρ_3) = 4 we have k = 4 and there is nothing to prove. So
suppose that k(ρ_3) = 2. If k = 2 there is nothing to prove, so suppose
moreover that k = 4. Let ω(f) be the differential form on X_0(3N(ρ_3)²)_{F̄_3}
as constructed in the next section. Since ω(f) can be lifted this shows that
ρ_3 is modular of type (3N(ρ_3)², 2, 1)_{F̄_3}. Since ρ_3 is finite at 3, Mazur’s
result (Theorem 3.2.1) shows that ρ_3 is modular of type (N(ρ_3)², 2, 1)_{F̄_3}.
It remains to deal with the split case. So suppose that ρ_3|_{I_3} ≅ χ_3 ⊕ 1. Then
we have seen that the weight one form f given by Langlands and Tunnell
has level 3N(ρ_3)². Then fE_{1,χ}, with χ: F_3^* → C^* non-trivial, shows that
ρ_3 is modular of type (3N(ρ_3)², 2, 1)_{F̄_3}. Theorem 3.2.1 shows that ρ_3 is
modular of type (N(ρ_3)², 2, 1)_{F̄_3}.


4.3 Differential Forms and a Version
of Carayol’s Lemma


In the last section we saw that ρ_3 is modular of type (N(ρ_3)², k(ρ_3), 1)_{F̄_3}.
Now we want to deduce that ρ_3 is modular of type (N(ρ_3)², 2, 1)_{F̄_3} if ρ_3 is
finite at 3 and of type (3N(ρ_3)², 2, 1)_{F̄_3} otherwise. It is well known that
cuspidal modular forms of a given type (N, 2, 1)_K, where K is a field of
characteristic zero, correspond via the Kodaira–Spencer isomorphism to
differential forms on the modular curve X_0(N)_K, i.e., to sections of the
sheaf Ω¹ on X_0(N)_K. This correspondence is compatible with the
action of Hecke operators. We define N := N(ρ_3)² if ρ_3 is finite at 3 and
N := 3N(ρ_3)² otherwise. We will show in fact that there exists a
non-zero differential form ω on X_0(N)_{Q̄_3} which is an eigenform for the Hecke
algebra and whose associated Galois representation ρ_ω: G_Q → GL_2(F̄_3) is
isomorphic to ρ_3. We will construct this ω by first constructing it over F̄_3,
which is in fact the hardest part (it involves a version of Carayol’s lemma);
a standard argument from commutative algebra shows that it can be lifted
to an eigenform (see for example [17, §9]).


Let us first do this construction in the case where $\rho_3$ is finite at 3. So
then $N$ is prime to 3 and we have a cuspidal eigenform $f$ of type $(N, 2, 1)_{\overline{\mathbf{F}}_3}$.
We want to construct a differential form $\omega$ on $X_0(N)_{\overline{\mathbf{F}}_3}$ which is an
eigenform for all Hecke operators, with the same eigenvalues as $f$. The problem
is that in general we cannot view $f$ as a section of a sheaf $\omega^{\otimes 2}$ on $X_0(N)_{\overline{\mathbf{F}}_3}$
and apply a Kodaira-Spencer isomorphism, because $X_0(N)_{\overline{\mathbf{F}}_3}$ does not
carry a universal family of generalized elliptic curves with a level structure
of type $\Gamma_0(N)$. We will see that this problem is not just artificial,
in the sense that for certain eigenforms of type $(N, 2, 1)_{\overline{\mathbf{F}}_3}$ there does not
exist a differential form as we want; it will exist for our $f$ because $\rho_3$ is not
induced from $G_{\mathbf{Q}(\sqrt{-3})}$.

Let $n > 3$ be an integer prime to 3. According to (1.2) we can view
$f$ as a section of the sheaf $\omega^{\otimes 2}$ on the smooth projective curve $X :=
\overline{M}([\Gamma_1(N), \Gamma(n)]_{\overline{\mathbf{F}}_3})$ over $\overline{\mathbf{F}}_3$ that carries a universal family of generalized
elliptic curves with a point of order $N$ and a trivialization of the $n$-torsion.
Let $Y$ denote the curve $X_0(N)_{\overline{\mathbf{F}}_3}$. Then $Y$ is the quotient of $X$ by the
action of the group $G := (\mathbf{Z}/N\mathbf{Z})^* \times \mathrm{GL}_2(\mathbf{Z}/n\mathbf{Z})$ acting on $X$, and $f$ is
a $G$-invariant section of $\omega^{\otimes 2}$. Let $\omega(f)$ be the global differential form
on $X$ obtained by applying the Kodaira-Spencer isomorphism to $f$. The
functorial properties of this isomorphism imply that $\omega(f)$ is $G$-invariant.
We would like to conclude that $\omega(f)$ is the pullback of a unique differential
form, that we will also call $\omega(f)$, on $Y$. The uniqueness is guaranteed by
the fact that the morphism $\pi: X \to Y$ is separable, which implies that the
pullback morphism $\pi^*$ on differential forms is injective. The problem is the
existence. The rest of this section is motivated by [28, II, Lemma 4.4] and
its proof.

Let $V$ be the biggest open part of $Y$ over which $\pi$ is etale. Note that $G$
acts on $X$ via its quotient $\overline{G}$ by the subgroup generated by $(-1, -1)$. The
group $\overline{G}$ acts faithfully, hence $V$ is the complement in $Y$ of the image under
$\pi$ of those points of $X$ with non-trivial stabilizer in $\overline{G}$. Since $\pi: \pi^{-1}V \to V$ is
etale, the restriction of $\omega(f)$ to $\pi^{-1}V$ is the pullback of a unique differential
form $\omega(f)$ on $V$. So we have to show that this $\omega(f)$ has no poles in the
complement of $V$. The information we have is that the pullback of $\omega(f)$ to
$X$ has no poles.

Let $y$ be a point of $Y$ over which $\pi$ is ramified, and let $x$ be in $\pi^{-1}y$. Let
$s$ and $t$ be uniformizers at $x$ and $y$, respectively. Let $e$ be the ramification
index at $x$, i.e., $t = s^e u$ with $u$ in $\mathcal{O}_{X,x}^*$. Let $r$ be the valuation $v_x(dt)$ at
$x$ of $dt$. Since $dt = s^{e-1}(eu + u's)\,ds$, where $u' = du/ds$, one has $r \ge e - 1$
with equality if and only if $\pi$ is tamely ramified at $x$. We have


$$v_x(\omega(f)) = e\,v_y(\omega(f)) + r.$$


It follows that $\omega(f)$ is regular at $y$ if $\pi$ is tamely ramified over $y$, so it
remains to look at those $y$ over which $\pi$ is wildly ramified. Such points
$y$ all correspond to the elliptic curve $E$ of $j$-invariant zero over $\overline{\mathbf{F}}_3$. The
automorphism group of $E$ is isomorphic to F3 x GL2(F2); the projection
to F3 comes from the action of Aut($E$) on the tangent space at zero of $E$,
the other projection comes from the action on the 2-torsion of $E$. The
invariants $e$ and $r$ are the same at all $x$ at which $\pi$ is wildly ramified; note
that we have $e = 3$ or $e = 6$. A global calculation as in [28, II, §2], using
the Hurwitz formula for $\pi$, shows that we have $r = e$, which means that the
wild ramification is of the mildest form. (In fact, this can also be derived
from Table 1 of [28, II, §2].) It follows that $\omega(f)$ has a pole of order at
most one at $y$.
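
The two pole-order conclusions above can be spelled out as a short computation. This is only a sketch: the symbol $a$ is introduced here for illustration, and the one input is that the pullback of $\omega(f)$ to $X$ is regular at $x$.

```latex
\begin{align*}
0 \le v_x(\pi^*\omega(f)) &= e\,a + r, \qquad a := v_y(\omega(f)),\\
\text{tame case } (r = e-1):\quad & a \ge -\tfrac{e-1}{e} > -1 \;\Longrightarrow\; a \ge 0,\\
\text{wild case } (r = e):\quad & a \ge -1 .
\end{align*}
```

Since $a$ is an integer, the tame case gives regularity at $y$, while the wild case only bounds the pole order by one.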

Now the number of $y$ over which $\pi$ is wildly ramified can be easily
computed: it is 1 if $N = 1$, it is 0 if $N$ is divisible by a prime number
congruent to $-1$ modulo 3, and otherwise it is $2^{\nu-1}$ where $\nu$ is the number
of primes dividing $N$. In the situation of [27, II, Lemma 4.4] this number is
zero or one, and it follows that in fact $\omega(f)$ is regular because the sum of
its residues must be zero. But if all primes dividing $N$ are 1 modulo 3 and
$\nu > 1$, there really exist eigenforms $f$ such that $\omega(f)$ has poles, so we have
to show why our $\omega(f)$ is regular.

Suppose that $\omega(f)$ is not regular. Let $D$ be the set of $y$ at which $\pi$
is wildly ramified, viewed as an effective divisor on $Y$, and let $V$ be the
$\overline{\mathbf{F}}_3$-vector space of $\overline{\mathbf{F}}_3$-valued functions on $D$. Consider the map


$$R: H^0(Y, \Omega^1(D)) \to V$$

which sends a form to its residues at the $y$ in $D$. Then the image of $\omega(f)$
is not zero. There is a natural action of the Hecke operators on $V$ which
is compatible with $R$. For $n \ge 1$ the Hecke operator $T_n$ acts on $V$ via
isogenies of degree $n$ between elements $y$ of $D$, hence by endomorphisms
of degree $n$ of $E$. Let $\sigma$ denote one of the two automorphisms of order
3 of $E$. An endomorphism $\phi$ of $E$ of degree $n$ that does not commute
with $\sigma$ contributes zero to the action of $T_n$ on $V$, since the correspondence
inducing $T_n$ is wildly ramified at $\phi$. Let $l$ be prime and congruent to $-1$
mod 3. Then $E$ has no endomorphism of degree $l$ commuting with $\sigma$, hence
$T_l$ acts as zero on $V$. It follows that $T_l f = 0$ for all such $l$, which contradicts
that $\rho_3$ is not induced from $G_{\mathbf{Q}(\sqrt{-3})}$.
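
The statement that $E$ has no endomorphism of degree $l$ commuting with $\sigma$ when $l \equiv -1$ mod 3 can be checked numerically. The sketch below rests on an assumption not spelled out in the text: that the endomorphisms commuting with $\sigma$ form the ring $\mathbf{Z}[\sigma] \cong \mathbf{Z}[\zeta_3]$, on which the degree is the norm form $a^2 - ab + b^2$. The function name is hypothetical.

```python
def eisenstein_norm_representations(n):
    """Return all integer pairs (a, b) with a*a - a*b + b*b == n.

    This is the norm of a + b*zeta_3 in Z[zeta_3]; under the assumption stated
    in the lead-in it is the degree of the corresponding endomorphism of E.
    """
    # a^2 - ab + b^2 >= (3/4) * max(a, b)^2, so a finite search suffices.
    bound = int((4 * n / 3) ** 0.5) + 1
    return [(a, b)
            for a in range(-bound, bound + 1)
            for b in range(-bound, bound + 1)
            if a * a - a * b + b * b == n]


if __name__ == "__main__":
    for l in (2, 5, 7, 11, 13, 17):
        print(l, l % 3, eisenstein_norm_representations(l))
```

The norm form is congruent to $(a + b)^2$ modulo 3, so it never takes a value congruent to $-1$ mod 3; the search simply makes this visible for small primes.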

Let us now consider the case where $\rho_3$ is not finite at 3. Then $N = 3N'$,
with $N' = N(\rho_3)^2$ prime to 3, and $k(\rho_3) = 4$. In this case $X_0(N)_{\overline{\mathbf{F}}_3}$ has
two irreducible components, both isomorphic to $X_0(N')_{\overline{\mathbf{F}}_3}$, which intersect
transversally at the supersingular points. The sheaf of Kähler differentials
on $X_0(N)_{\mathbf{Z}_3}$ is not locally free of rank one at the double points over $\mathbf{F}_3$,
and it is better to work with the dualizing sheaf $\Omega$ on it. This sheaf
can be obtained as follows: let $X_0(N)_{\mathbf{Z}_3}^{\mathrm{sm}}$ be the smooth locus of $X_0(N)_{\mathbf{Z}_3}$
(i.e., the complement of the double points), let $j: X_0(N)_{\mathbf{Z}_3}^{\mathrm{sm}} \to X_0(N)_{\mathbf{Z}_3}$ be
the inclusion and let $\Omega^1$ be the sheaf of Kähler differentials on $X_0(N)_{\mathbf{Z}_3}^{\mathrm{sm}}$;
then $\Omega = j_*\Omega^1$. The dualizing sheaf $\Omega$ is locally free of rank one and it is
dualizing in the sense of Serre duality. For a more detailed description of $\Omega$
in the context of modular curves see [29, §§6-7], [17, §§8-9] and references
therein.

Let $Y := X_0(N)_{\overline{\mathbf{F}}_3}$, and let $Y_0$ (resp., $Y_\infty$) be the irreducible
component of $Y$ containing the cusp 0 (resp., $\infty$). The restrictions of $\Omega$ to
$Y_0$ and $Y_\infty$ are the sheaves of Kähler differentials with poles of order
at most one at the supersingular points. Let $n > 3$ be prime to 3, let
$X = \overline{M}([\Gamma_1(N'), \Gamma(n)]_{\overline{\mathbf{F}}_3})$ and let $G := (\mathbf{Z}/N'\mathbf{Z})^* \times \mathrm{GL}_2(\mathbf{Z}/n\mathbf{Z})$. Then $f$
is a $G$-invariant section of $\omega^{\otimes 4}$ on $X$. It follows that $f/A$ (recall that $A$ is the
Hasse invariant) is a rational section of $\omega^{\otimes 2}$ with poles of at most order one
at the supersingular points. Applying the Kodaira-Spencer isomorphism
to $f/A$ gives a $G$-invariant rational differential form $\omega(f)$ on $X$ which has
poles of order at most one at the supersingular points. We identify the
quotient of $X$ by $G$ with $Y_\infty$. Since $\omega(f)$ is $G$-invariant, we can view $\omega(f)$
as a rational differential form on $Y_\infty$, with poles only at the supersingular
points and at the points where $X \to Y_\infty$ is ramified. A calculation as above
shows that there are only poles of order at most one at the supersingular
points. The proof of [17, Prop. 9.3] shows that there exists a section of $\Omega$ on
$X_0(N)_{\overline{\mathbf{F}}_3}$ which is an eigenform for the Hecke algebra and whose restriction
to $Y_\infty$ is $\omega(f)$.

We end this section with some remarks. First of all, one can show that
the Hecke action on the $\overline{\mathbf{F}}_3$-vector space $V$ that occurs in the arguments for
$\rho_3$ finite at 3 is “Eisenstein”: for $l$ prime and not dividing $3N$, $T_l$ acts as
$1 + l$ on $V$. Hence it would have been sufficient to use that $\rho_3$ is irreducible.
For more general quotients of $X$ by subgroups of the form $H \times \mathrm{GL}_2(\mathbf{Z}/n\mathbf{Z})$
of $G$, the vector space $V$ is not necessarily “Eisenstein.” This is related to
results on groups of connected components of Néron models in [36].
Secondly, instead of studying in detail the wild ramification in the
morphism $X \to Y$, we could have used Theorem 3.2.1 as follows. Let $q$ be
any prime number that is congruent to $-1$ mod 3 and that does not
divide $N$. Then, replacing $N$ by $qN$, one gets $X \to Y$ tamely ramified,
hence a differential form on $X_0(qN)_{\overline{\mathbf{F}}_3}$. This shows that $\rho_3$ is modular of
type $(Nq, 2, 1)_{\overline{\mathbf{Q}}_3}$. Then Mazur's result shows that $\rho_3$ is modular of type
$(N, 2, 1)_{\overline{\mathbf{Q}}_3}$. One reason to give the argument above is to illustrate the
problems one gets when interpreting modular forms as differential forms.


4.4 Carayol’s Reductions 


At this point we know that $\rho_3$ is modular of type $(N(\rho_3)^2, 2, 1)_{\overline{\mathbf{Q}}_3}$ if $\rho_3$
is finite at 3 and of type $(3N(\rho_3)^2, 2, 1)_{\overline{\mathbf{Q}}_3}$ otherwise. We want to show
that $\rho_3$ is modular of type $(N(\rho_3), 2, 1)_{\overline{\mathbf{Q}}_3}$ if $\rho_3$ is finite at 3 and of type
$(3N(\rho_3), 2, 1)_{\overline{\mathbf{Q}}_3}$ otherwise. Before explaining how this is done, it is good
to recall some results of Langlands, Deligne and Carayol (see [5]). Let $p$
be a prime number and let $f$ be a newform of some type $(N, k, \varepsilon)_{\overline{\mathbf{Q}}_p}$ with
$k \ge 2$. Then this gives us a representation $\rho_f: G_{\mathbf{Q}} \to \mathrm{GL}_2(\overline{\mathbf{Q}}_p)$, determined
by the property that it is unramified outside $Np$ and that for $l$ not dividing
$Np$ the Frobenius element $\rho_f(\mathrm{Frob}_l)$ has trace $a_l(f)$. On the other hand,
there is also a representation $\pi_f: \mathrm{GL}_2(\hat{\mathbf{Z}} \otimes \mathbf{Q}) \to \mathrm{GL}(V)$, with $V$ an infinite
dimensional $\overline{\mathbf{Q}}_p$-vector space, associated to $f$ in the following way. Let $W$
be the direct limit, taken over all multiples $n \ge 1$ of $N$, of the $\overline{\mathbf{Q}}_p$-vector
spaces $H^0(\overline{M}([\Gamma(n)]_{\overline{\mathbf{Q}}_p}), \omega^{\otimes k})$. It is clear that $\mathrm{GL}_2(\hat{\mathbf{Z}})$ acts on $W$, and it
is not hard to see that this action extends to one of $\mathrm{GL}_2(\hat{\mathbf{Z}} \otimes \mathbf{Q})$. Then $V$
is the subspace of $W$ that is generated by the $g(f)$, for $g$ in $\mathrm{GL}_2(\hat{\mathbf{Z}} \otimes \mathbf{Q})$.
One knows that $V$ is an irreducible representation of $\mathrm{GL}_2(\hat{\mathbf{Z}} \otimes \mathbf{Q})$ and that
$V = \otimes'_l V_l$ is the restricted tensor product, over all primes $l$, of irreducible
admissible representations $\pi_{f,l}: \mathrm{GL}_2(\mathbf{Q}_l) \to \mathrm{GL}(V_l)$. The result alluded to
describes, for all $l \ne p$, the restriction $\rho_{f,l}$ of $\rho_f$ to some decomposition
group at $l$ in terms of $\pi_{f,l}$. It follows from this result that the conductor
$N(\overline{\rho}_f)$ of the reduction $\overline{\rho}_f: G_{\mathbf{Q}} \to \mathrm{GL}_2(\overline{\mathbf{F}}_p)$ divides $N$ (here we suppose that
$\overline{\rho}_f$ is irreducible, since otherwise it is not well defined). Carayol [4] and
Livné [26] have classified, in terms of the $\pi_{f,l}$, the $l \ne p$ dividing $N/N(\overline{\rho}_f)$.

The strategy for proving that $\rho_3$ is modular of type $(N(\rho_3), 2, 1)_{\overline{\mathbf{Q}}_3}$ if $\rho_3$
is finite at 3 and of type $(3N(\rho_3), 2, 1)_{\overline{\mathbf{Q}}_3}$ otherwise will be the following.
Suppose that we know that $\rho_3$ is modular of type $(N, 2, 1)_{\overline{\mathbf{Q}}_3}$ for some
$N$ dividing $N(\rho_3)^2$. Let $l \ne 3$ be a prime number and suppose that $l^2$
divides $N$. Then we want to show that $\rho_3$ is modular of type $(N', 2, 1)_{\overline{\mathbf{Q}}_3}$
for some $N'$ dividing $N$ with $l^2$ not dividing $N'$. Of course, this suffices.

So suppose that $f$ is a cuspidal eigenform of type $(N, 2, 1)_{\overline{\mathbf{Q}}_3}$ with $N$
dividing $N(\rho_3)^2$ and that $l \ne 3$ is a prime number such that $l^2$ divides $N$.
The newform associated to $f$ has level dividing $N$, so we may in fact assume
that $f$ is a newform. The classification of Carayol and Livné then says that
$\rho_{f,l}$ is of one of the following two types:


1. $\rho_{f,l}$ is a direct sum of two ramified characters $\alpha, \beta: G_{\mathbf{Q}_l} \to \overline{\mathbf{Q}}_3^*$ whose
reductions $\overline{\alpha}, \overline{\beta}: G_{\mathbf{Q}_l} \to \overline{\mathbf{F}}_3^*$ are unramified,


2. $\rho_{f,l} = \mathrm{Ind}_{G_K}^{G_{\mathbf{Q}_l}} \omega$ with $K$ the unique unramified quadratic extension of
$\mathbf{Q}_l$ and $\omega: G_K \to \overline{\mathbf{Q}}_3^*$ a ramified character with unramified reduction
$\overline{\omega}: G_K \to \overline{\mathbf{F}}_3^*$.


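Let us deal with the first case first. Let $\chi_3: G_{\mathbf{Q}} \to \mathbf{Z}_3^*$ be the character
giving the action on all roots of unity of 3-power order. Recall that since
$f$ has trivial character and weight two we have $\det(\rho_f) = \chi_3$. This implies
that $\alpha\beta$ is unramified. There is a unique character $\varepsilon$ of $\mathbf{F}_l^* = \mathrm{Gal}(\mathbf{Q}(\zeta_l)/\mathbf{Q})$
with values in the kernel of $\overline{\mathbf{Z}}_3^* \to \overline{\mathbf{F}}_3^*$ such that $\alpha\varepsilon$ is unramified. Let $f'$ be
the newform corresponding to the twist $f \otimes \varepsilon$ of $f$ by $\varepsilon$. One way to express
this is to say that $a_n(f') = a_n(f)\varepsilon(n)$ for all $n$ prime to $l$. Another way is
to say that $\rho_{f'} = \rho_f \otimes \varepsilon$. Anyway, $f'$ is a newform of type $(N/l, 2, \varepsilon^2)_{\overline{\mathbf{Q}}_3}$
giving $\rho_3$. Since $\overline{\varepsilon} = 1$ and $\rho_3$ is not induced from $\mathbf{Q}(\sqrt{-3})$, Carayol's
Lemma implies that $\rho_3$ is modular of type $(N/l, 2, 1)_{\overline{\mathbf{Q}}_3}$.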

Let us now say something about case 2. The analog of this case with
3 replaced by a prime $p \ge 5$ is treated in [4, §5], and uses the Jacquet-Langlands
correspondence to switch to a certain Shimura curve. The generalization
to the case $p = 3$, using that $\rho_3$ is not induced from $\mathbf{Q}(\sqrt{-3})$, is
explained in [12, §5]. One might also say that this generalization is done in
[4], if one admits that the remarks in [4, §4.4] concerning modular curves 
also hold for the Shimura curves used in [4, §5]. We will now sketch the 
argument. 

Let $q$ be a prime number not dividing 3, such that $\rho_3(\mathrm{Frob}_q)$ is
conjugated to $\rho_3(c)$, with $c$ a complex conjugation. Then a result of Ribet
(see [34]) says that there exists a newform $f'$ of type $(N'q, 2, 1)_{\overline{\mathbf{Q}}_3}$, with
$N'$ dividing $N$, such that $\overline{\rho}_{f'} = \rho_3$ and with $\rho_{f',q}$ special, i.e., $\rho_{f',q}$ is a
non-split extension of an unramified character $\alpha$ with $\alpha^2 = 1$ by $\alpha\chi_3$ (see
also [47, II, Lemma 2.3]). If $\rho_{f',l}$ is not in case 2, then one applies the
method to deal with case 1 to get rid of the $l^2$ in the level and then one
applies Mazur's Theorem 3.2.1 to get rid of $q$. So we may assume that $\rho_{f',l}$
is in case 2. Let $B$ be the quaternion algebra over $\mathbf{Q}$ with discriminant pq.
By the Jacquet-Langlands correspondence and the results of [5], $\rho_{f'}$ can
be constructed from the 3-adic Tate module of the jacobian of a Shimura
curve of a certain level associated to $B$. On this Shimura curve one has
an action by the group Ff, x Fie which is analogous to the action of the 
diamond operators on modular curves. A version of Carayol's Lemma then
shows that $\rho_3$ actually arises from the quotient of this Shimura curve by
that group. Switching back to modular curves by the Jacquet-Langlands
correspondence then shows that $\rho_3$ is modular of type $(N''q, 2, 1)_{\overline{\mathbf{Q}}_3}$ with
$N''$ dividing $N/l$. Then one finishes by applying Theorem 3.2.1.

Before this conference I had never been to any mathematics gathering
where so many people worked as hard or with such high spirits, trying to 
understand a single piece of mathematics. 

The talents of the organizers helped shape this collective activity which 
concentrated on the work of Wiles [W] and that of Taylor-Wiles [T-W]. 
Among their ideas for doing this, there is one which I feel should be recorded 
in these Proceedings (because of its novelty, and its possible usefulness in 
future conferences): since the lecture hall of the conference held roughly 
400 participants, and attendance was usually that high, it was impractical 
for the audience to be given the opportunity to ask questions either during 
a lecture or immediately afterwards. The solution to this difficulty was to 
set up a special “question room,” open every evening, in which a team of 
“experts” would be available to either answer, or explore, any question! 

The organizers asked me to lecture about deformation theory; they also 
gave me a specific assignment meant to fit in with their over-all program: I 
was asked to include in my lectures an explanation of a particular theorem 
which describes the Zariski tangent space of the universal deformation rings 
attached to specific Galois deformation problems. 

I hope the introductory material about representations which comprises 
Part One of these notes, will be helpful to people who want to get a glimpse 
of this subject, and of its deformations. 


DEFORMATION THEORY OF GALOIS REPRESENTATIONS 245 


As for the general concept of “deformation,” it is everywhere in math- 
ematics and to get a quick sense of its various manifestations in the liter- 
ature, you can consult the annotated bibliography of this subject, com- 
piled by C. Doran (eventually to be included in [D-W]). This can be 
downloaded by anonymous ftp at (abel.harvard.edu) in the directory 
mazur handout. 

Hendrik Lenstra, in his lecture in the conference, recounted that twenty 
years ago he was firm in his conviction that he DID want to solve Diophan- 
tine equations, and that he DID NOT wish to represent functors — and 
now he is amused to discover himself representing functors in order to solve 
Diophantine equations! Part Two of these notes is all about functors: here 
I concentrate on the specific task assigned to me; namely, to show that
“first-order infinitesimal” information concerning the universal deforma- 
tion ring attached to a representation p can be expressed in terms of group 
cohomology (of the adjoint representation of p). This is quite a general 
phenomenon, does not even depend upon the representability of the defor- 
mation problem, and has an appropriate variant for deformation problems 
subject to conditions. To get the easiest-to-state and most-flexible result, 
it made sense to me to do a bit of housekeeping in the theory: firstly, 
to formulate a general functorial notion (called nearly representable in 
§18 below) which is general enough so that every representation is “nearly 
representable” and which is stringent enough so that any “nearly repre- 
sentable” functor has well-working Zariski tangent modules; secondly, to 
give appropriate functorial axioms which cover the bestiary of different 
specific “conditions” that will eventually be imposed on the various defor- 
mation theories, so as to be able to deal with all of them somewhat more 
systematically. This leads to the formal definition of deformation con- 
dition given in §23 below. I want to thank B. Conrad, F. Gouvéa, J.-P. 
Serre, and J. Tate for their valuable suggestions and for their corrections 
to a preliminary version of this article. I am also grateful to Carol Oliveira 
for the excellent job she did in typing my manuscript.

Part One


CHAPTER I. GALOIS REPRESENTATIONS 


§1. The Galois group of a number field, and a way of studying
it. If $K$ is a field and $\overline{K}$ a choice of separable algebraic closure of $K$, let
$G_K = \mathrm{Gal}(\overline{K}/K)$ denote “the” Galois group of $K$. The group $G_K$ is a
profinite topological group with its natural Krull topology, where a base of
open subgroups is given by the “fixers” of finite extension fields of $K$ in $\overline{K}$.

Now let $K$ be a number field; i.e., a finite extension of $\mathbf{Q}$. If $S$ is a finite
set of non-archimedean places of $K$, or equivalently of non-zero prime ideals
in the ring of integers of $K$, then $G_{K,S}$ will denote the Galois group of


$K_S$ := the maximal algebraic extension of $K$ in $\overline{K}$ unramified outside $S$


(with no condition imposed at the archimedean primes of $K$). An equivalent
definition of $K_S$ is that it is the union of all finite extensions of $K$ in $\overline{K}$
whose relative discriminant is not divisible by any prime outside $S$. Fixing
a finite set of primes $S$ and a number $d$, the classical theorem of
Hermite-Minkowski assures us that there are only a finite number of field extensions
of $K$ inside $\overline{K}$ of degree $\le d$ unramified outside $S$. But $K_S/K$ is often of
infinite degree; equivalently, the group $G_{K,S}$ is often infinite (and certainly
is so if, for example, $S$ contains all primes of $K$ lying above some prime
of $\mathbf{Q}$). The profinite group $G_{K,S}$ (given its Krull topology) is naturally a
quotient of $G_K$. The kernel of the projection homomorphism $G_K \to G_{K,S}$
is the closed normal subgroup generated by all inertia subgroups of $G_K$
attached to places $v$ not in $S$. The Galois group $G_K$ is (countably) infinitely
generated as a topological group.

What is the “structure” of $G_{K,S}$, whatever that means? It is not even
known whether or not $G_{K,S}$ is finitely generated as a topological group
(although this has been conjectured to be the case by Shafarevich about
thirty years ago). Here is a property, weaker than the property of being
“topologically finitely generated,” which is known to hold for the groups
$G_{K,S}$ and which will serve us well in our theory below.


Definition. Let $p$ be a prime number, and $\Pi$ a profinite group. Let us
say that $\Pi$ satisfies the p-finiteness condition if for all open subgroups
$\Pi_0 \subset \Pi$ of finite index, there are only a finite number of continuous
homomorphisms from $\Pi_0$ to $\mathbf{Z}/p\mathbf{Z}$.


For a discussion of this property and its various equivalent formulations, 
see [M 1]. 

The groups $\Pi = G_{K,S}$ satisfy the p-finiteness condition for all prime
numbers $p$. The reason for this is that any open subgroup $\Pi_0 \subset \Pi = G_{K,S}$
of finite index is again of the form $G_{K_0,S_0}$, for some finite field extension
$K_0/K$, and the set of continuous homomorphisms,


$$\mathrm{Hom}_{\mathrm{cont}}(G_{K_0,S_0}, \mathbf{Z}/p\mathbf{Z}) = \mathrm{Hom}_{\mathrm{cont}}(G^{\mathrm{ab}}_{K_0,S_0}, \mathbf{Z}/p\mathbf{Z})$$


is finite, as can be proved as an exercise using (you choose!) either some
Kummer Theory or a bit of Class Field Theory.

Here, as below, the superscript “ab” means the maximal (profinite) topo- 
logical quotient group which is abelian; i.e., the quotient by the closure of 
the commutator subgroup. 

Nowadays it is generally understood that the salient “structure” needed
to be studied in connection with arithmetic problems is not merely the
topological group $G_{K,S}$. Consider this rather more elaborate structure.
For each place $v$ of $K$, an imbedding of $\overline{K}$ in an algebraic closure, $\overline{K}_v$, of
the completion of $K$ at $v$ gives us a continuous homomorphism


$$i_v: G_{K_v} \to G_{K,S};$$


a change of imbedding $\overline{K} \subset \overline{K}_v$ changes the homomorphism $i_v$ by
conjugation. If $v$ is nonarchimedean and not in $S$, then the homomorphism $i_v$
factors through the quotient of $G_{K_v}$ by the inertia subgroup $I_{K_v}$, giving us
a homomorphism

$$i_v: G_{K_v}/I_{K_v} \to G_{K,S}.$$


Since $G_{K_v}/I_{K_v}$ is canonically isomorphic to $G_{k_v}$, where $k_v$ is the residue
field at $v$ and $\overline{k}_v$ is the residue field of the valuation ring of $\overline{K}_v$, and since $G_{k_v}$
has a canonical topological generator $\varphi_v$ (called the “Frobenius” element:
$\varphi_v$ is the automorphism of $\overline{k}_v$ which sends any element of $\overline{k}_v$ to its $|k_v|$-th
power), the mapping $i_v$ is determined by simply giving the image of $\varphi_v$
under $i_v$. There is usually no confusion caused by the practice of referring
to the image of $\varphi_v$ under $i_v$ as “the Frobenius element,” $\mathrm{Frob}_v$, in $G_{K,S}$
attached to $v$, with the understanding that such a “Frobenius” element is
only unique up to conjugation. If $v$ is real, then $G_{K_v}$ is cyclic of order two,
and the “Frobenius element” at such a $v$ will simply mean the image of the
nontrivial element of $G_{K_v}$. We want to study the isomorphism class of the
entire “package”


- $G_{K,S}$,
- the conjugacy classes of the homomorphisms $i_v: G_{K_v} \to G_{K,S}$
for all places $v$ of $K$.


Equivalently, we want to understand the package 


- $G_{K,S}$,

- the conjugacy classes of the Frobenius elements $\varphi_v \in G_{K,S}$ for
all places $v$ of $K$ which are not in $S$,

- the homomorphisms $i_v: G_{K_v} \to G_{K,S}$ for the finite set of places
$v \in S$.

In contrast to our lack of knowledge concerning the topological finite
generation of $G_{K,S}$ we know that the local Galois groups $G_{K_v}$ are
topologically finitely generated, and we have a fairly developed understanding of
some systems of generators and relations for them, thanks to the efforts of
Neukirch, Koch, and others.

We also have a reasonably satisfactory understanding of the
abelianization $(G_{K,S})^{\mathrm{ab}}$ of $G_{K,S}$, as well as of the abelianization of the entire
“package” above; this is the principal achievement of Class Field Theory.
The special case of this when $K = \mathbf{Q}$ was known earlier (by the turn of the
century). Explicitly, if $S$ is a finite set of prime numbers, let $\mu_S$ stand for
the set of all $N$-th roots of unity in $\overline{\mathbf{Q}}$ where $N$ ranges through all numbers
whose set of prime divisors is contained in $S$. Then the maximal abelian
extension of $\mathbf{Q}$ unramified outside a finite set of primes $S$ is the subfield
of $\overline{\mathbf{Q}}$ generated by $\mu_S$ (a theorem of Kronecker and Weber). Moreover, we
have canonical isomorphisms


$$G^{\mathrm{ab}}_{\mathbf{Q},S} \cong \mathrm{Gal}(\mathbf{Q}(\mu_S)/\mathbf{Q}) \cong \mathrm{Aut}(\mu_S) \cong \prod_{p \in S} \mathbf{Z}_p^*,$$


the second isomorphism above being essentially the content of the result of
Gauss which established the “irreducibility of the cyclotomic polynomials.”
The Frobenius element at any prime number $\ell$ not in $S$ corresponds, under
the above isomorphisms, to that element in $\prod \mathbf{Z}_p^*$ whose $p$-th coordinate is
given by the integer $\ell$ in $\mathbf{Z}_p$, for each $p \in S$.
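
The description of Frobenius elements in the cyclotomic case is concrete enough to compute with. The following sketch truncates each $p$-adic coordinate modulo $p^k$; the function names and the truncation parameter `precision` are illustrative choices made here, not notation from the text.

```python
from math import gcd


def frobenius_coordinates(ell, S, precision):
    """Coordinates of Frob_ell in prod_{p in S} (Z/p^precision Z)^*.

    The p-th coordinate is the integer ell itself, read modulo p^precision.
    """
    if any(ell % p == 0 for p in S):
        raise ValueError("ell must not lie in S")
    return {p: ell % p ** precision for p in S}


def frobenius_order_on_roots_of_unity(ell, N):
    """Order of Frob_ell acting on the N-th roots of unity (zeta -> zeta^ell).

    This is the multiplicative order of ell modulo N.
    """
    if gcd(ell, N) != 1:
        raise ValueError("ell must be prime to N")
    if N == 1:
        return 1
    order, power = 1, ell % N
    while power != 1:
        power = power * ell % N
        order += 1
    return order


if __name__ == "__main__":
    print(frobenius_coordinates(29, [2, 3], 3))
    print(frobenius_order_on_roots_of_unity(2, 7))
```

For instance, with $S = \{2, 3\}$ the Frobenius at 29 is recorded by the pair (29 mod 8, 29 mod 27) at precision 3.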

But can we extend our study of $G_{K,S}$ beyond describing its
abelianization? One unavoidable point to contend with, if you want to go further, is
that the group $G_K$ is nonabelian, is defined only in reference to a choice
of algebraic closure of $K$, and therefore is difficult to pin down
more intrinsically than “up to conjugation.” A standard tactic (which
might be called the “Tannakian approach”) suitable for such situations is
to try to study representations of $G_K$ (up to isomorphism) because the
study of representations is insensitive to the fact that we know $G_K$ only up
to inner automorphism. From this perspective, one achievement of Class
Field Theory has been to provide an adequate theory of one-dimensional
representations of $G_{K,S}$: i.e., representations into $\mathrm{GL}_1(\mathbf{C})$, the
multiplicative group of $\mathbf{C}$ (or, more flexibly but with no more generality, into the
multiplicative group of any commutative ring). To go further in our study,
we are led, then, to think about Galois representations, i.e., continuous
homomorphisms,


(1)    $\rho: G_{K,S} \to \mathrm{GL}_N(A)$


for $A$ some topological ring, and any $N = 1, 2, \ldots$. To understand the
“package” above we must understand such representations as well as their
restrictions to the groups $G_{K_v}$ for all places $v$ of $K$, i.e., their “local
behavior.” In particular, if $v$ is a place of $K$ not in $S$, the restriction of $\rho$ to
the group $G_{K_v}$ is given by simply giving the conjugacy class of the image
$\rho(\varphi_v)$ of a Frobenius element $\varphi_v$ under $\rho$. The trace, $a_v := \mathrm{Trace}_A(\rho(\varphi_v))$,
is independent of the choice of Frobenius as it is independent of the
representation $\rho$ up to conjugation. It is therefore a well-defined invariant of
the equivalence class of the representation $\rho$, for each choice of $v$ not in $S$.
As we shall see, in many instances, this data


$v \mapsto a_v \in A$ for $v$ not in $S$


will be enough to reconstruct $\rho$ up to equivalence. Thanks to the Theorem
of Chebotarev, even less data is often sufficient: e.g., one need only give the
above data for $v$ ranging through a set of places of density 1 (outside $S$).


§2. What coefficient-rings should we allow for our Galois
representations? Since $G_{K,S}$ is a profinite topological group and since we are
requiring the homomorphisms (1) to be continuous, the tightest fit, so to
speak, would be if the receiving topological group $\mathrm{GL}_N(A)$ were a profinite
topological group as well. I hope this is enough to motivate the following
choice: 

From now on in this article, a coefficient-ring will mean a complete 
noetherian local ring A with finite residue field k. The choice of k is usually 
fixed in our discussions. We will consecrate the letter p for the characteristic 
of $k$. Such a coefficient-ring $A$ carries its natural profinite topology, a base
of open ideals being given by the powers of its maximal ideal $\mathfrak{m}_A$:


$$A = \varprojlim_{\nu} A/\mathfrak{m}_A^{\nu}.$$


By a coefficient-ring homomorphism let us mean a continuous
homomorphism of coefficient rings


$$A' \to A$$


such that the inverse image of the maximal ideal $\mathfrak{m}_A$ is the maximal ideal
$\mathfrak{m}_{A'} \subset A'$ and the induced homomorphism on residue fields is an
isomorphism.

If $A$ is a coefficient-ring and $p$ the characteristic of its residue field $k$, $p$ is
topologically nilpotent in $A$, and so there is a natural ring-homomorphism
$\mathbf{Z}_p \to A$. This ring-homomorphism would be a “coefficient-ring
homomorphism” if the residue field $k$ were the prime field $\mathbf{F}_p$. In general, let $W(k)$
be the “ring of Witt vectors of $k$,” that is, $W(k)$ is the canonical discrete
valuation ring extension of $\mathbf{Z}_p$ which is absolutely unramified and which
has residue field equal to $k$. Any coefficient-ring $A$ with residue field $k$ is
naturally endowed with a continuous (“coefficient-ring”) homomorphism
$W(k) \to A$, which induces the identity on residue fields. (For the
construction and basic properties of the ring of Witt vectors, see [Se 1] or [Mat].)
Our coefficient-rings are then naturally topological $W(k)$-algebras.
The group $\mathrm{GL}_N(A)$ carries the corresponding profinite topology,


$$\mathrm{GL}_N(A) = \varprojlim_{\nu \to \infty} \mathrm{GL}_N(A/\mathfrak{m}_A^{\nu}),$$

a base of open normal subgroups being the multiplicative groups of $N \times N$
matrices with coefficients in $A$ which, when reduced modulo a fixed power
of $\mathfrak{m}_A$, become the identity $N \times N$ matrix.

A continuous homomorphism (1) will be referred to as a Galois
representation with coefficient-ring $A$. The integer $N$ is called the degree of
the representation.


§3. Galois representations arise naturally. Given an elliptic curve
$E$ defined over a number field $K$, and an integer $n$, by the group of
$n$-division points of $E$, denoted $E[n]$, we mean the group of points of $E$
rational over $\overline{K}$ which lie in the kernel of the homomorphism


$$E \xrightarrow{\;n\;} E$$


given by multiplication by $n$. The group $G_K$ acts naturally as a group of
automorphisms of the group $E(\overline{K})$ of $\overline{K}$-rational points of the elliptic curve
$E$, and induces an action of $G_K$ on $E[n]$. Since $E[n]$ is abstractly a product
of two cyclic groups of order $n$, this natural action gives a continuous
homomorphism,

$$G_K \to \mathrm{GL}_2(\mathbf{Z}/n\mathbf{Z})$$

which factors through $G_{K,S}$ where $S$ comprises all prime divisors of $n$, and
primes of bad reduction for $E$. The induced homomorphism


$$\rho_{E,n}: G_{K,S} \to \mathrm{GL}_2(\mathbf{Z}/n\mathbf{Z})$$


we might call the n-division point representation attached to $E$. Passing
to the projective limit of these $n$-division point representations as $n$
ranges over the multiplicative system of natural numbers, or as $n$ ranges
over the direct system of all powers of a fixed prime number $p$, gives
representations

$$\rho_E: G_{K,S} \to \mathrm{GL}_2(\hat{\mathbf{Z}}), \quad\text{and}$$


$$\rho_{E,p^\infty}: G_{K,S} \to \mathrm{GL}_2(\mathbf{Z}_p),$$


respectively, where $\hat{\mathbf{Z}}$ is the profinite completion of $\mathbf{Z}$, and $\mathbf{Z}_p$ is the ring
of $p$-adic integers.


Example. The only $n > 1$ for which the $n$-division point representation
attached to an elliptic curve $E$ is “dead easy” to describe directly in terms
of the defining Weierstrass form of the equation, $y^2 = g(x)$, for $E$, is $n = 2$.
Here $g(x)$ is a cubic polynomial with distinct roots. The 2-division point
representation

$$\rho_{E,2}: G_{K,S} \to \mathrm{GL}_2(\mathbf{Z}/2\mathbf{Z})$$

factors through the Galois group of the splitting field over $K$ of the
polynomial $g(x)$, and $\rho_{E,2}$ factors through the natural representation of the
Galois group of that splitting field to the symmetric group $S_3$ using the
isomorphism,


$$S_3 \cong \mathrm{GL}_2(\mathbf{Z}/2\mathbf{Z})$$
which is unique up to conjugation.
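
The example can be tested numerically in a special case. The sketch below assumes $K = \mathbf{Q}$, a monic cubic $g$ with integer coefficients, and an odd prime $\ell$ at which $g$ still has distinct roots; the function names are hypothetical. The trace of Frobenius in $\mathrm{GL}_2(\mathbf{Z}/2\mathbf{Z})$ is read off from how $g$ factors mod $\ell$: an element of order 3 in $S_3$ (no roots) has trace 1, every other element has trace 0. This is compared with the parity of $\ell + 1 - \#E(\mathbf{F}_\ell)$.

```python
def roots_mod(coeffs, ell):
    """Roots in F_ell of g(x) = x^3 + a*x^2 + b*x + c, with coeffs = (a, b, c)."""
    a, b, c = coeffs
    return [x for x in range(ell) if (x**3 + a * x * x + b * x + c) % ell == 0]


def count_points(coeffs, ell):
    """Number of points of y^2 = g(x) over F_ell, including the point at infinity."""
    a, b, c = coeffs
    # square_roots[v] = number of y in F_ell with y^2 = v
    square_roots = {}
    for y in range(ell):
        v = y * y % ell
        square_roots[v] = square_roots.get(v, 0) + 1
    total = 1  # the point at infinity
    for x in range(ell):
        total += square_roots.get((x**3 + a * x * x + b * x + c) % ell, 0)
    return total


def trace_mod_2(coeffs, ell):
    """Trace of Frobenius at ell in GL_2(Z/2Z), read off from the roots of g mod ell.

    Assumes g has distinct roots mod ell, so the number of roots is 0, 1 or 3.
    No roots means Frobenius is a 3-cycle in S_3, which has trace 1.
    """
    return 1 if len(roots_mod(coeffs, ell)) == 0 else 0


if __name__ == "__main__":
    g = (0, 0, -2)  # y^2 = x^3 - 2, an illustrative choice
    for ell in (5, 7, 11, 13, 31):
        a_ell = ell + 1 - count_points(g, ell)
        print(ell, a_ell % 2, trace_mod_2(g, ell))
```

The two parities agree because $E(\mathbf{F}_\ell)$ has even order exactly when it contains a point of order 2, that is, when $g$ has a root in $\mathbf{F}_\ell$.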


More generally: Going back to consideration of general $n$-division
point representations associated to elliptic curves, a construction similar
to the one involving elliptic curves, but beginning with an abelian variety
of dimension $g$ over a number field $K$, provides Galois representations of
degree $2g$ with coefficient rings $\mathbf{Z}/n\mathbf{Z}$, $\hat{\mathbf{Z}}$, and $\mathbf{Z}_p$ as well. If we start with
an abelian variety whose ring of endomorphisms rational over $K$ contains
a commutative ring $A$ larger than the ring $\mathbf{Z}$, we may get Galois
representations with other coefficient-rings, as well (specifically, quotients and
completions of $A$).

We can construct Galois representations with coefficient-rings $\mathbf{Z}/n\mathbf{Z}$, $\hat{\mathbf{Z}}$,
and $\mathbf{Z}_p$ by considering the natural action of $G_K$ on the étale cohomology
groups of algebraic varieties defined over $K$. Related to this, there is the
classical theory due to Shimura, Deligne, and Deligne-Serre, which attaches
to arbitrary classical modular eigenforms (of integral weights $\ge 1$) Galois
representations of degree 2 with coefficient-rings equal to various
completions and quotients of the ring generated by the action of Hecke operators
on the space of modular forms of given level and weight.


CHAPTER II. GROUP REPRESENTATIONS 


§4. Group representations versus algebra representations. 
Given a positive integer N, a coefficient-ring A with residue field k, and
a profinite group $\Pi$, the set of continuous group-homomorphisms

    $\rho : \Pi \to GL_N(A)$

is in one-one correspondence with the set of continuous homomorphisms of
A-algebras

    $r : A[[\Pi]] \to M_N(A)$,

where $A[[\Pi]]$ is the completed group-ring of $\Pi$ with coefficients in A,

    $A[[\Pi]] = \varprojlim_{\Pi_0} A[\Pi/\Pi_0]$,

where $\Pi_0$ runs through all open normal subgroups of finite index in $\Pi$, and
$A[\Pi/\Pi_0]$ is the usual group-ring of the finite group $\Pi/\Pi_0$ with coefficients
in A. Here $M_N(A)$ is the A-algebra of N × N matrices with entries from
A. The correspondence $r \mapsto \rho$ comes by restriction, noting that $\Pi$ may be
identified with a subgroup of the group of multiplicative units $A[[\Pi]]^*$ in
the ring $A[[\Pi]]$ and the algebra-homomorphism r restricts to a continuous
homomorphism of groups of units,

    $r^* : A[[\Pi]]^* \to M_N(A)^* = GL_N(A)$.

By the “underlying residual representation” to $\rho$, and to r,

    $\bar\rho : \Pi \to GL_N(k)$ and $\bar r : A[[\Pi]] \to M_N(k)$,

we mean the composition of $\rho$ and of r with the natural projections

    $GL_N(A) \to GL_N(k)$ and $M_N(A) \to M_N(k)$

respectively.


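Proposition. The residual representation

    $\bar\rho : \Pi \to GL_N(k)$

associated to $\rho$ is absolutely irreducible if and only if the homomorphism r
is surjective.


Proof. This is well known if A is a field, i.e., if A = k: cf. [Bourb 1, Ch.
VIII §13 n° 4]. It follows for general coefficient-rings A from Nakayama’s
Lemma applied to the following diagram of A-modules:

    $$\begin{array}{ccc} \mathrm{Image}(r) & \subset & M_N(A) \\ & \searrow & \downarrow \\ & & M_N(k). \end{array}$$

Indeed, the diagonal arrow is surjective exactly when $\bar r$ is; in that case
$\mathrm{Image}(r) + m_A \cdot M_N(A) = M_N(A)$, where $m_A$ is the maximal ideal of A, and
Nakayama’s Lemma gives $\mathrm{Image}(r) = M_N(A)$.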
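
The field case of the proposition can be seen in a toy computation. The following is a minimal sketch under simplifying assumptions of ours: $\Pi$ is a finite group, A = k = $\mathbb{F}_2$, and N = 2, so that $A[[\Pi]]$ is the ordinary group algebra and the image of r is the $\mathbb{F}_2$-span of the matrices $\rho(g)$. We compare the full group $GL_2(\mathbb{F}_2)$ (absolutely irreducible on $\mathbb{F}_2^2$) with its cyclic subgroup of order 3 (irreducible over $\mathbb{F}_2$, but reducible over $\mathbb{F}_4$).

```python
from itertools import product

def mul(m1, m2):
    """Multiply two 2x2 matrices over F_2, each stored as a flat tuple (a, b, c, d)."""
    a, b, c, d = m1
    e, f, g, h = m2
    return ((a*e + b*g) % 2, (a*f + b*h) % 2, (c*e + d*g) % 2, (c*f + d*h) % 2)

def span_dimension(matrices):
    """Dimension over F_2 of the linear span, by Gaussian elimination on 4-bit vectors."""
    pivots = {}  # leading bit position -> reduced vector with that leading bit
    for m in matrices:
        v = int("".join(map(str, m)), 2)
        while v:
            lead = v.bit_length() - 1
            if lead not in pivots:
                pivots[lead] = v
                break
            v ^= pivots[lead]  # cancel the leading bit and keep reducing
    return len(pivots)

identity = (1, 0, 0, 1)
# All six invertible matrices: the image of an absolutely irreducible representation.
gl2 = [m for m in product(range(2), repeat=4) if (m[0]*m[3] - m[1]*m[2]) % 2 == 1]
# The cyclic subgroup of order 3 generated by c.
c = (0, 1, 1, 1)
c3 = [identity, c, mul(c, c)]

print(span_dimension(gl2), span_dimension(c3))
```

The span has dimension 4 = N² for the full group, so r is surjective there, while the order-3 subgroup only spans a 2-dimensional subalgebra, in line with its failure to be absolutely irreducible.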


Corollary. (Schur’s Lemma) Let $\rho : \Pi \to GL_N(A)$ be a continuous
representation with coefficient-ring A. If the associated residual representation
$\bar\rho$ is absolutely irreducible, any matrix in $M_N(A)$ which commutes with all
the elements in the image of $\rho$ is a scalar.


Proof. Since the completion of the A-algebra generated by the image of $\rho$
is equal to the image of r (i.e., is all of $M_N(A)$ by the above proposition)
any matrix commuting with all the elements in the image of $\rho$ lies in the
center of $M_N(A)$. The fact that such elements are scalar matrices is valid
for A any commutative ring with unit; it can be seen by directly checking
what it means for a matrix to commute with the basic N × N matrices
$E_{ij}$ (which have a 1 as their entry in the i-th row and j-th column and 0
elsewhere).
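
The final check in this proof is easy to carry out by machine in a small case. The following is a minimal sketch, assuming (our choice, for illustration only) N = 2 and the commutative ring Z/4Z, which is not a field: it enumerates all 256 matrices and keeps those commuting with every elementary matrix $E_{ij}$.

```python
from itertools import product

N, MOD = 2, 4  # 2x2 matrices over Z/4Z, a commutative ring that is not a field

def matmul(x, y):
    """Product of two NxN matrices with entries reduced modulo MOD."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(N)) % MOD for j in range(N))
        for i in range(N)
    )

def elementary(i, j):
    """The matrix E_ij: a 1 in row i, column j, and 0 elsewhere."""
    return tuple(tuple(1 if (r, c) == (i, j) else 0 for c in range(N)) for r in range(N))

units = [elementary(i, j) for i in range(N) for j in range(N)]

def all_matrices():
    """Every NxN matrix over Z/MOD."""
    for entries in product(range(MOD), repeat=N * N):
        yield tuple(tuple(entries[r * N:(r + 1) * N]) for r in range(N))

# Matrices commuting with every E_ij, compared with the scalar matrices a * Id.
central = [m for m in all_matrices() if all(matmul(m, e) == matmul(e, m) for e in units)]
scalars = [tuple(tuple(a if r == c else 0 for c in range(N)) for r in range(N)) for a in range(MOD)]
print(sorted(central) == sorted(scalars))
```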


§5. Representations and their characters. Keeping the notational
conventions of the previous paragraph, let $\rho : \Pi \to GL_N(A)$ be a
representation where A is a coefficient-ring with residue field k of characteristic p.
We assume that the underlying residual representation $\bar\rho : \Pi \to GL_N(k)$
is absolutely irreducible (or equivalently, by the proposition in §4, that
$r : A[[\Pi]] \to M_N(A)$ is surjective).

Proposition. Let $\rho' : \Pi \to GL_N(A)$ be a representation with the same
character as $\rho$, i.e., such that $\mathrm{Trace}_A\,\rho(g) = \mathrm{Trace}_A\,\rho'(g)$ for all $g \in \Pi$.
Then $\rho'$ and $\rho$ are equivalent representations.


See [Ca] and [Se 2]. The following proof is taken from [Se 2]. 


Proof. Let $r, r' : A[[\Pi]] \to M_N(A)$ be the A-algebra homomorphisms
corresponding to $\rho$ and $\rho'$. By hypothesis, the residual representation $\bar\rho$ is
absolutely irreducible. We shall first prove that $\bar\rho'$, the residual
representation associated to $\rho'$, is equivalent to $\bar\rho$ and hence is also absolutely
irreducible. Let $\bar\rho'_{ss}$ denote the semi-simplification of $\bar\rho'$. Then $\bar\rho$ and $\bar\rho'_{ss}$
are semi-simple representations with the same character. It follows (cf.
the proof of Th. 30.16 in [C-R]) that the multiplicity of any absolutely
irreducible representation $\psi$ occurring in $\bar\rho'_{ss}$ is congruent modulo p to the
multiplicity of $\psi$ in $\bar\rho$. In particular, since $\bar\rho$ is absolutely irreducible, the
multiplicity of $\bar\rho$ in $\bar\rho'_{ss}$ is $1 + p \cdot \nu$ for some integer $\nu \ge 0$. But $\bar\rho$ and $\bar\rho'_{ss}$
are both of the same degree, and therefore $\nu = 0$ and $\bar\rho$ is equivalent to
$\bar\rho'_{ss}$. So $\bar\rho$ and $\bar\rho'$ are both absolutely irreducible. By the proposition of §4,
r and r' are both surjective. We will be using this latter fact, along with
the hypothesis that the character functions of r and of r' are equal; i.e.
$\mathrm{Trace}_A(r(a)) = \mathrm{Trace}_A(r'(a))$ for all $a \in A[[\Pi]]$.

Define the A-module homomorphism

    $\varphi : A[[\Pi]] \to M_N(A) \times M_N(A)$

by the rule $\varphi(x) = (r(x), r'(x))$ and let

    $\Phi \subset M_N(A) \times M_N(A)$

denote the image of $\varphi$. Since r and r' are surjective, the VERY general
principle that Serre calls “Goursat’s Lemma” applies (cf. [Se 2])¹ which
gives a precise description of the image $\Phi$ of $\varphi$. The A-submodule $\Phi$ is given
as follows:

(*) There are two-sided ideals $\mathcal{I} \subset M_N(A)$ and $\mathcal{I}' \subset M_N(A)$ and an
isomorphism of A-algebras $f : M_N(A)/\mathcal{I} \to M_N(A)/\mathcal{I}'$ such that $\Phi$ is the
“graph” of f in the sense that

    $\Phi = \{(a, a') \in M_N(A) \times M_N(A) \mid f(a \bmod \mathcal{I}) = a' \bmod \mathcal{I}'\}$.

But the only two-sided ideals $\mathcal{I}$ in $M_N(A)$ are of the form $\mathcal{I} = I \cdot M_N(A)$
where $I \subset A$ is an ideal in A. (The proof of this is an exercise: the ideal
$I \subset A$ may be taken to be the ideal in A generated by all entries of all
the matrices in $\mathcal{I}$.) Therefore the ideals $\mathcal{I}, \mathcal{I}'$ occurring in (*) are of the
form $I \cdot M_N(A)$ and $I' \cdot M_N(A)$ for ideals $I, I' \subset A$. Since the annihilators
of the (isomorphic) A-modules $M_N(A)/\mathcal{I}$ and $M_N(A)/\mathcal{I}'$ are I and I',
respectively, we have I = I'. Now take any element $x \in I$ and consider the
N × N matrix X which has x as entry in its first column and first row, and
zeroes elsewhere. Let X' denote the N × N matrix all of whose entries are
zero. The couple of N × N matrices $(X, X') \in M_N(A) \times M_N(A)$ are in $\Phi$
(by the description (*)).
Therefore

    $x = \mathrm{Trace}(X) = \mathrm{Trace}(X') = 0$,

i.e., the ideal I vanishes, and $\Phi$ is in fact the graph of an actual isomorphism
of A-algebras $f : M_N(A) \to M_N(A)$. By [Bourb 2, Ch. II §5, ex. 2] any
such isomorphism is inner; by construction, we have $f \circ r = r'$. It follows
that $\rho$ is equivalent to $\rho'$.

¹ Goursat’s Lemma works with $M_N(A)$ replaced by e.g., any A-algebra.


Corollary. Let $\rho, \rho' : G_{K,S} \to GL_N(A)$ be continuous representations.
Suppose that one of these representations is residually absolutely
irreducible. Suppose further that

    $\mathrm{Trace}_A\,\rho(\mathrm{Frob}_\ell) = \mathrm{Trace}_A\,\rho'(\mathrm{Frob}_\ell)$

for $\ell$ running through a set of prime numbers (outside S) which is of
Dirichlet density 1. Then $\rho$ is equivalent to $\rho'$.


Proof. This comes from combining the above proposition with the
Chebotarev Density Theorem.


§6. “Descent” of group representations. Schur-type Theorems.
Suppose we are given a continuous representation

    $\rho : \Pi \to GL_N(A)$

such that the associated character function

    $\chi_\rho(x) = \mathrm{Trace}_A(\rho(x)) \in A$

has the property that it takes its values in a sub-ring $A_0 \subset A$. Is there a
“descent” of $\rho$ to $A_0$, i.e., is there a continuous representation

    $\rho_0 : \Pi \to GL_N(A_0)$

such that when one extends scalars from $A_0$ to A, $\rho_0$ becomes equivalent
to $\rho$? The answer is YES for our coefficient-rings A, but descent is not
necessarily valid for more general rings. For results along these lines, see
[M 1] and [G]. The most general (and the most perspicuous!) such result,
to date, is to be found in [Ca] and [Se 2]. Here is a brief account of it.

Let R be a general commutative local ring, with maximal ideal $m_R$ and
residue field $k = R/m_R$. Let

    $r : R[[\Pi]] \to M_N(R)$

be the R-algebra homomorphism associated to a continuous representation
$\rho : \Pi \to GL_N(R)$. We assume that r is surjective. Let $R_0$ be a local
sub-ring of R with maximal ideal $m_{R_0} = m_R \cap R_0$ and residue field $k_0 \subset k$.
Suppose that the traces, $\mathrm{Trace}_R(r(a))$, of all elements $a \in R_0[[\Pi]]$ lie in $R_0$.
The “descent question” posed above would have a positive answer if, for
example, the image $r(R_0[[\Pi]]) \subset M_N(R)$ were isomorphic, as $R_0$-algebra,
to the matrix algebra $M_N(R_0)$. Now this is not the case in general but
it is “almost” the case. The sense in which it is “almost” the case is best
explained in terms of Azumaya algebras (cf. [K-O]).


Definition. An Azumaya Algebra $\mathcal{R}$ over R is a finite flat R-algebra
such that $\mathcal{R}/m_R\mathcal{R}$ is a central simple algebra over k.

By the rank of an Azumaya Algebra over R, one means its rank as
R-module. For example, the matrix algebra $M_N(R)$ is an Azumaya Algebra
over R of rank $N^2$.


The fundamental general result is the following 


Proposition. (Carayol, Serre) Let $r : R[[\Pi]] \to M_N(R)$ be a surjective
R-algebra homomorphism and let $R_0$ be a local sub-ring of R as above.
Suppose that the traces, $\mathrm{Trace}_R(r(a))$, of all elements $a \in R_0[[\Pi]]$ lie in $R_0$.
Then the image

    $\mathcal{R}_0 := r(R_0[[\Pi]]) \subset M_N(R)$

is an Azumaya Algebra over $R_0$ of rank $N^2$ such that the natural R-algebra
homomorphism $\mathcal{R}_0 \otimes_{R_0} R \to M_N(R)$ is an isomorphism.


Brief proof (and remarks). Carayol and Serre phrase their proposition in
slightly greater generality (or at least greater flexibility) in that the domain
may be taken to be an algebra (and not necessarily a completed group
algebra) and the range may be taken to be a general Azumaya Algebra $\mathcal{R}$
over R (and not necessarily $M_N(R)$). Let, then, $\mathcal{R}$ be a general Azumaya
Algebra over R of rank $N^2$ and $\mathcal{R}_0$ a subring of $\mathcal{R}$ which is an $R_0$-algebra
such that the traces of elements of $\mathcal{R}_0$ lie in $R_0$ and such that $R \cdot \mathcal{R}_0 = \mathcal{R}$.
From the last condition we see that we can find a set of $N^2$ elements $e_j$
($j = 1, \ldots, N^2$) in $\mathcal{R}_0$ which form an R-basis for $\mathcal{R}$. For any element
$a \in \mathcal{R}_0$ write

    $a = \sum_j \lambda_j e_j$

with $\lambda_j$ in R. We will show that the $\lambda_j$’s lie in $R_0$. From the displayed
formula, we have

    (*)  $\mathrm{Trace}(a \cdot e_k) = \sum_j \lambda_j \cdot \mathrm{Trace}(e_j \cdot e_k)$  (for $k = 1, \ldots, N^2$).

Now since the matrix $(\mathrm{Trace}(e_j \cdot e_k))$ has a determinant which is not in
$m_R$ (i.e., is nonzero after reduction to $k = R/m_R$) and since the system
of linear equations (*) in the “variables” $\lambda_j$ has coefficients in $R_0$, the
unique solution $(\lambda_1, \ldots, \lambda_{N^2})$ lies in $R_0$. It follows that the elements $e_j$
($j = 1, \ldots, N^2$) form a free $R_0$-basis for $\mathcal{R}_0$.
Consequently,

    $\mathcal{R}_0 \otimes_{R_0} R \to M_N(R)$

is an R-algebra isomorphism, and $\mathcal{R}_0$ is an $R_0$-Azumaya Algebra of rank
$N^2$ (because

    $(\mathcal{R}_0/m_{R_0} \cdot \mathcal{R}_0) \otimes_{k_0} k \cong \mathcal{R}/m_R \cdot \mathcal{R}$

is a central simple algebra over k, and therefore $\mathcal{R}_0/m_{R_0} \cdot \mathcal{R}_0$ is a central
simple algebra over $k_0$).
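
The key step of this proof, recovering the coefficients $\lambda_j$ from traces alone by solving the linear system (*), can be tried numerically. The following is a minimal sketch under assumptions of ours: we work with 2 × 2 matrices over $\mathbb{Q}$ rather than over a local ring, and the basis $e_1, \ldots, e_4$ and the element a are hypothetical choices made only for illustration. The trace matrix here has determinant −16, so the same computation would make sense over any local ring in which 2 is a unit.

```python
from fractions import Fraction

def matmul(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def trace(x):
    return x[0][0] + x[1][1]

def solve(gram, rhs):
    """Solve gram * lam = rhs over Q by Gauss-Jordan elimination."""
    n = len(rhs)
    aug = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(gram, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

# A basis e_1..e_4 of the 2x2 matrices (hypothetical choice for illustration).
basis = [
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
    [[0, 1], [-1, 0]],
    [[1, 0], [0, -1]],
]
a = [[3, 5], [-1, 7]]  # the element whose coordinates we want to recover

# Row k of the system (*): Trace(a e_k) = sum_j lam_j Trace(e_j e_k).
gram = [[trace(matmul(ej, ek)) for ej in basis] for ek in basis]
rhs = [trace(matmul(a, ek)) for ek in basis]
lam = solve(gram, rhs)
print(lam)
```

Only the traces $\mathrm{Trace}(a \cdot e_k)$ and $\mathrm{Trace}(e_j \cdot e_k)$ enter the computation, which is why the solution is forced to lie in the subring where those traces live.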


To apply the above proposition we must know something about Azumaya
Algebras over R. A theorem of Azumaya [Az], [K-O] gives us that
the Brauer group of a Henselian local ring is isomorphic to that of its
residue field. This applies to our situation, for all our coefficient-rings A
are Henselian and their residue fields are finite (and therefore they have
trivial Brauer group). Thus, our coefficient rings admit no nontrivial
Azumaya algebras. We get:


Corollary. Let $\rho : \Pi \to GL_N(A)$ be absolutely irreducible, and let $A_0 \subset A$
be a local subring of the coefficient-ring A such that the traces $\mathrm{Trace}_A(\rho(x))$
for all elements $x \in \Pi$ lie in $A_0 \subset A$. Then there is a representation
$\rho_0 : \Pi \to GL_N(A_0)$ which, after extension of scalars from $A_0$ to A, becomes
equivalent to $\rho$.


But note that the ring $A_0$ given in the Corollary may have a smaller
residue field than that of A; i.e., the injection $A_0 \to A$ may not be a
coefficient-ring homomorphism.


Remark. I had given a proof of the above result (cf. [M 1] 1.8 Prop. 4
and Corollaries 1, 2) under a further hypothesis (that the one-dimensional
cohomology of the image of $\bar\rho$ in $GL_N(k)$ with coefficients in the adjoint
representation Ad($\bar\rho$)° vanishes). That proof has the disadvantage that it
requires this extra hypothesis and that it uses the construction of the
universal deformation ring of $\bar\rho$. In contrast, the above result of Carayol and
Serre can itself be used to aid in the construction of the universal
deformation ring, as in Lenstra and de Smit’s construction, or in that of Rouquier
or Nyssen (see §7 below). Compare this also with the construction of
universal varieties of representations of algebras given by Procesi in the early
70’s ([P 1], [P 2]).


An idle question. From the vantage point of this section, an absolutely
irreducible $G_{K,S}$ representation with coefficient-ring A is given by an Azumaya
Algebra (equivalently: total matrix algebra of finite rank) over A occurring
as a quotient A-algebra of $A[[G_{K,S}]]$. Are there interesting classes of
A-algebras (of infinite rank over A; analogues of the “factors” occurring in the
classical theory of Murray and von Neumann) which occur as quotients
of $A[[G_{K,S}]]$ and which deserve study?

§7. Characterizing character-functions (results of Rouquier, Nyssen).
In this section, let K be any commutative ring, and $\Pi$ a profinite
group. By a central function $f : K[[\Pi]] \to K$ we mean a K-linear function
such that $f(x \cdot y) = f(y \cdot x)$ for all $x, y \in K[[\Pi]]$. Given a central
function f, and a positive integer m, define the function $f_m(x_1, x_2, \ldots, x_m)$ to
be the signed-symmetrization of f evaluated on the products of the $x_i$ in
all permuted ways. Explicitly,

    $f_m(x_1, x_2, \ldots, x_m) = \sum_{\sigma \in S_m} \mathrm{sign}(\sigma) \cdot f(x_{\sigma(1)} \cdot x_{\sigma(2)} \cdots x_{\sigma(m)})$

where $S_m$ is the symmetric group on m letters. Clearly, $f_m$ is an
anti-symmetric K-linear function on $K[[\Pi]]^m$ with values in K. A central
function f is called a pseudo-character of degree N ≥ 1 (see [Rouq]) if,
equivalently,

(1) $f_N$ does not vanish identically, but $f_{N+1}$ does vanish identically.
(2) $f_m$ does not vanish identically for all m ≤ N and does vanish
identically for all m > N.
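
The displayed formula for $f_m$ can be evaluated directly in a toy setting. The following is a minimal sketch under assumptions of ours: in place of $K[[\Pi]]$ we use 2 × 2 integer matrices, and f is the ordinary trace (a central function on that algebra). The code implements the displayed sum literally and is only meant to illustrate the anti-symmetry of $f_m$ in its arguments.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation given as a tuple of images of 0..m-1."""
    inversions = sum(
        1 for i in range(len(perm)) for j in range(i + 1, len(perm)) if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def trace(x):
    return sum(x[i][i] for i in range(len(x)))

def f_m(f, xs):
    """Signed symmetrization: sum over sigma of sign(sigma) * f(x_sigma(1) ... x_sigma(m))."""
    total = 0
    for perm in permutations(range(len(xs))):
        prod = xs[perm[0]]
        for idx in perm[1:]:
            prod = matmul(prod, xs[idx])
        total += sign(perm) * f(prod)
    return total

x = [[1, 2], [3, 4]]
y = [[0, 1], [1, 0]]
z = [[2, 0], [1, 1]]

# Swapping two arguments changes the sign; repeating an argument gives zero.
print(f_m(trace, [x, y, z]), f_m(trace, [y, x, z]), f_m(trace, [x, x, z]))
```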


The characters of irreducible representations of finite groups $\Pi$ yield
“pseudo-characters” in the above sense, as was proved by Frobenius [Fr].
The definition of pseudo-character given by Rouquier is a mild modification
of the notion of pseudo-representation due to Taylor [T], which generalized
a prior notion due to Wiles. One says that a pseudo-character f is
irreducible if f cannot be expressed as the sum of two pseudo-characters
whose degrees add up to the degree of f. For a full discussion of this
theory, see loc. cit.; see also the preprint of Louise Nyssen [Ny]. See Th. 4.2
of [Rouq] for a proof of the fact that if K is an algebraically closed field,
irreducible pseudo-characters of degree N are precisely the characters of
irreducible representations of $\Pi$ with values in K. Closely related to this
result is a characterization of the characters of representations of $\Pi$ into
$GL_N(K)$ for K any commutative ring, and in particular, any coefficient-ring
(cf. §5, §6 of [Rouq]), leading to a construction of the universal deformation
ring by “constructing the universal pseudo-character.”


§8. Deformations of a group representation. Let $\Pi$ be a profinite
group. Suppose we are given a coefficient-ring homomorphism

    $h : A_1 \to A_0$

of two coefficient-rings. Let N be a positive integer and denote by the same
letter

    $h : GL_N(A_1) \to GL_N(A_0)$

the induced homomorphism of groups of invertible N × N matrices. If

    $\rho_0 : \Pi \to GL_N(A_0)$

is a continuous homomorphism, a deformation of $\rho_0$ to the
coefficient-ring $A_1$ is a strict equivalence class of liftings $\rho_1$ of $\rho_0$ (that is, continuous
homomorphisms $\rho_1$ with $h \circ \rho_1 = \rho_0$),

    $$\begin{array}{ccc} \Pi & \xrightarrow{\ \rho_1\ } & GL_N(A_1) \\ & {\scriptstyle \rho_0}\searrow & \downarrow {\scriptstyle h} \\ & & GL_N(A_0), \end{array}$$

where two liftings $\rho_1$ and $\rho_1'$ are called strictly equivalent if they can be
brought one into another by conjugation by elements of $GL_N(A_1)$ in the
kernel of h.

Any representation $\rho$ is, of course, a deformation of its underlying
residual representation $\bar\rho$ to A.
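
Strict equivalence can be made concrete in a very small case. The following is a minimal sketch under assumptions of ours: $A_1 = \mathbb{Z}/9\mathbb{Z}$, $A_0 = \mathbb{Z}/3\mathbb{Z}$, h is reduction mod 3, N = 2, and we only track the value g of a lifting on a single chosen group element. A matrix M = 1 + 3X lies in the kernel of h, with inverse 1 − 3X since $9X^2 = 0$, and conjugating by it changes the lifting without changing its reduction.

```python
MOD1, MOD0 = 9, 3  # A_1 = Z/9Z reduces to A_0 = Z/3Z (hypothetical small example)
IDENTITY = ((1, 0), (0, 1))

def matmul(x, y, mod):
    """Product of two 2x2 matrices with entries reduced modulo mod."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) % mod for j in range(2))
        for i in range(2)
    )

def reduce(x, mod):
    """Entrywise reduction of a matrix modulo mod."""
    return tuple(tuple(v % mod for v in row) for row in x)

g = ((1, 1), (0, 1))  # value of a lifting on one chosen group element
X = ((0, 2), (1, 1))  # arbitrary matrix used to build an element of the kernel of h

# M = 1 + 3X reduces to the identity mod 3; its inverse mod 9 is 1 - 3X.
M = reduce(((1 + 3 * X[0][0], 3 * X[0][1]), (3 * X[1][0], 1 + 3 * X[1][1])), MOD1)
M_inv = reduce(((1 - 3 * X[0][0], -3 * X[0][1]), (-3 * X[1][0], 1 - 3 * X[1][1])), MOD1)

g_conj = matmul(matmul(M, g, MOD1), M_inv, MOD1)

# The conjugated value differs from g over Z/9Z but has the same reduction mod 3.
print(g_conj, reduce(g_conj, MOD0) == reduce(g, MOD0))
```

Two liftings related in this way are different homomorphisms into $GL_N(A_1)$ but define the same deformation.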


Convention. It is not uncommon in the literature to use the phrase
“representation $\rho$” to mean, at times, a specific homomorphism $\rho$ and at other
times an equivalence class of homomorphisms of which $\rho$ is a member. It is
probably best not to be too pedantic about this point, if, in every instance
where this occurs, the context makes it clear which sense is meant, or else
makes it clear that it doesn’t matter which sense is meant. We will try
to make things clear in what follows, but mention here that whenever we
use the phrase “residual representation $\bar\rho$”, we mean a specific
homomorphism, and whenever we are interested in making a specific choice of
a homomorphism $\rho$ whose underlying residual representation is $\bar\rho$, we shall
refer to it explicitly as a lifting of $\bar\rho$; if we want its strict equivalence class
we will refer to it as a deformation of $\bar\rho$.

For a coefficient-ring A, consider the category $\hat{\mathcal{C}}(A)$ whose objects are
coefficient-rings $A_1$ together with a coefficient-ring homomorphism $A_1 \to A$
(which will be sometimes referred to as an A-augmentation) and where
morphisms are commutative diagrams of coefficient-ring homomorphisms,

    $$\begin{array}{ccc} A_1 & \to & A_2 \\ \downarrow & & \downarrow \\ A & = & A. \end{array}$$

(The reason for the $\hat{\ }$ in the notation is that we will later be also
considering the full sub-category $\mathcal{C}(A)$ whose objects are artinian coefficient-rings
$A_1$ with homomorphism to A.) Note that by our hypothesis that the
A-augmentation is a coefficient-ring homomorphism, for all objects $A_1 \to A$
in $\hat{\mathcal{C}}(A)$ the residue field of $A_1$ is equal to k.

Given a coefficient-ring A, a profinite group $\Pi$, and a continuous
homomorphism

    $\rho : \Pi \to GL_N(A)$,

define the functor $D_\rho : \hat{\mathcal{C}}(A) \to \text{Sets}$ by the rule which assigns to any
object $A_1 \to A$ of $\hat{\mathcal{C}}(A)$ the set of deformations (strict equivalence classes
of liftings) of $\rho$ to $A_1$. The phrase “the deformation problem for $\rho$” will refer to
the study of this functor. Much of the time we will be interested in the
case when A = k, the residue field, and $\rho = \bar\rho$ is a residual representation
(the “absolute” case) but from time to time we will be dealing with the
“relative” case, i.e., with a specific lifting of a residual representation $\bar\rho$
to a homomorphism $\rho : \Pi \to GL_N(A)$ (and not just a strict equivalence
class of liftings of $\bar\rho$ to A) and we will be interested in deformations of $\rho$ to
coefficient-rings $A_1$ endowed with a homomorphism to A.


CHAPTER III. THE DEFORMATION 
THEORY OF GALOIS REPRESENTATIONS 


§9. Why study “Galois” deformation theory? We will be principally
interested in the case where $\Pi = G_{K,S}$ for some algebraic number field K
and finite set of primes S in K. Here are three possible reasons for studying
the deformation theory of representations of $G_{K,S}$.
1) First consider residual representations, i.e., Galois representations
$\bar\rho : G_{K,S} \to GL_N(k)$ where k is a finite field. It takes only a finite amount
of data to give a residual representation, and moreover, there are only a
finite number of such residual representations (for fixed K, S, N, and k).
Attached to a residual representation $\bar\rho$ one can consider the whole panoply
of Galois representations which are deformations of $\bar\rho$. If $\bar\rho$ is absolutely
irreducible, any member of this panoply comes from a single neat package,
namely from a “universal deformation” (see below) and in particular,
from a single representation into $GL_N$ with coefficients in a single
complete noetherian local ring with residue field k. This coefficient ring $R(\bar\rho)$,
uniquely defined up to unique isomorphism by the universal property, is
called the universal deformation ring, an explicit description of which
(and of the universal deformation of $\bar\rho$ to it) is tantamount to a systematic
“classification” of all Galois representations which are liftings of $\bar\rho$. The
spectrum, Spec $R(\bar\rho)$, will be called the universal deformation space of
$\bar\rho$. For some “explicit” easy examples, see [B 1] and [B-M]; for other
expository accounts of the deformation theory of Galois representations giving a
number of examples, see [B 2], [M 1-3].
2) Given the universal deformation ring $R(\bar\rho)$ of a residual
representation, one can then ask which quotient rings correspond to Galois
representations with particularly desirable properties. Equivalently, we are asking
for the closed subschemes of Spec $R(\bar\rho)$, “the universal deformation space
of $\bar\rho$,” corresponding to those properties. For example: Which points of the
universal deformation space are “modular”?² Which come as irreducible
representations on the étale cohomology of algebraic varieties?

The recipe for cutting down the “universal deformation” to these more
specifically desirable Galois representations is (surprisingly enough!) at
least conjecturally nothing more than the “imposition” of local conditions
at the ramified primes, and sometimes with the additional prescription of
the appropriate global determinant. For example,

² That is, which such points classify representations that are “attached to modular
forms?”

i) There is a conjecture I made with Fontaine [F-M] which says that, up
to $\mathbb{Q}_p$-equivalence, the irreducible Galois representations (with
coefficient-ring $A = \mathbb{Z}_p$) which come as irreducible constituents of the natural Galois
representations on the p-adic étale cohomology of algebraic varieties
(allowing integral twists) are precisely those whose restriction to the
decomposition groups at primes dividing p are potentially semi-stable.

ii) There is a somewhat older conjecture for N = 2, $K = \mathbb{Q}$, relating
Galois representations which are “ordinary at p” to classical modular forms
of slope 0 (see [M 2], [M-T] and [G]).

iii) There is the generalization of the conjectures referred to in ii) as
formulated in [W] (still for N = 2 and $K = \mathbb{Q}$).

A good part of the conjectures in ii) and iii) has recently been proved
by the monumental work in [W] and [T-W], which more than amply answers
the question posed by the title of this section.
3) Galois representations are often systematically presented to us “in 
certain families,” these families being continuous, and they are usually 
even analytic in a p-adic sense. Hida, for example, has an extensive theory 
which shows that all Galois representations attached to classical modular 
eigenforms of slope 0 come to us in such families (cf. Hida’s book [H] and 
the bibliography there for the extensive literature about this). Based on 
Hida’s work, and on some numerical investigation, Fernando Gouvéa and 
I had conjectured that all modular (finite slope) Galois representations
come in specific families of this type [G-M]. This conjecture (or at least 
a qualitative form of it) has very recently been established by Coleman 
[C]. To “visualize” these families of modular Galois representations and 
specifically how these families intersect with each other and with the various 
loci describing various local conditions, it is good (perhaps even essential!) 
to be working in something like the universal space. Certain families of 
Galois representations are tightly controlled simply by understanding how 
they sit in the universal deformation space (cf. [M 3]). 

One often has some understanding of the universal deformation space. 
We shall end this section by citing two examples: 


Example 1. (An “unobstructed” case) When $\bar\rho$ is an absolutely
irreducible representation of degree two, and of odd determinant (meaning
that if c is a complex conjugation involution in $G_{\mathbb{Q}}$, then the determinant
of $\bar\rho(c)$ is −1) and when “the deformation theory for $\bar\rho$ is unobstructed”³
then the universal deformation ring $R(\bar\rho)$ is isomorphic to a power series
ring in three variables over W(k); cf. [M 1]. Here is a specific instance of
this. Let $K/\mathbb{Q}$ be the splitting field of the cubic polynomial $X^3 - X + 1$.
The Galois group of this equation is the symmetric group on three letters,
and K is unramified over $\mathbb{Q}$ at all primes other than p = 23 (and ∞). Let
p = 23. Since the group $S_3$ has a faithful representation in $GL_2(\mathbb{F}_p)$ we
obtain from this equation an absolutely irreducible residual representation

    $\bar\rho : G_{\mathbb{Q},\{23,\infty\}} \to GL_2(\mathbb{F}_{23})$.

It has been shown (cf. [M 1]) that this is an “unobstructed deformation
problem” and (therefore) that the universal deformation ring of $\bar\rho$ is
isomorphic to a power series ring $\mathbb{Z}_{23}[[t_1, t_2, t_3]]$ in three variables. For a
detailed study of this deformation problem and a general class of
unobstructed problems, see [M 1], [B 1], [B-M].

³ For a definition of the notion of “unobstructed deformation theory,” cf. [M 1].
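
The arithmetic behind this example is quick to verify. The following is a minimal sketch (function names are ours): it computes the discriminant of the cubic from the standard formula $-4p^3 - 27q^2$ for $X^3 + pX + q$, and checks that the cubic has no root modulo 2, which shows it is irreducible over $\mathbb{Q}$.

```python
def cubic_discriminant(p, q):
    """Discriminant of the depressed cubic X^3 + p*X + q."""
    return -4 * p**3 - 27 * q**2

def has_root_mod(coeffs, prime):
    """Check for a root modulo prime; coeffs are listed from the constant term upward."""
    return any(
        sum(c * x**i for i, c in enumerate(coeffs)) % prime == 0
        for x in range(prime)
    )

# X^3 - X + 1 corresponds to p = -1, q = 1.
disc = cubic_discriminant(-1, 1)
print(disc, has_root_mod([1, -1, 0, 1], 2))
```

The discriminant is −23, which is not a square in $\mathbb{Q}$, so the Galois group of the irreducible cubic is all of $S_3$, and 23 is the only finite prime that can ramify in the splitting field.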


Example 2. (An “obstructed” case) N. Boston and S.V. Ullom [B-U]
have studied the interesting deformation theory of the residual
representation

    $\bar\rho : G_{\mathbb{Q},\{3,7,\infty\}} \to GL_2(\mathbb{F}_3)$

coming from the Galois representation on the 3-division points of the elliptic
curve $X_0(49)$. Here the universal deformation ring is isomorphic to

    $\mathbb{Z}_3[[t_1, t_2, t_3, t_4]]/((1 + t_4)^3 - 1)$

whose deformation space then is geometrically reducible, and (after the
adjunction of a primitive cube root of unity) splits into three irreducible
components (given by specializing $1 + t_4$ to the three cube roots of 1).
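
The splitting into three components can be read off from the factorization of the relation. As a short worked computation (with $\zeta$ denoting a primitive cube root of unity, our notation):

```latex
\begin{align*}
(1+t_4)^3 - 1 &= t_4\,(t_4^2 + 3t_4 + 3) \\
              &= (1 + t_4 - 1)\,(1 + t_4 - \zeta)\,(1 + t_4 - \zeta^2),
\end{align*}
% using (1+t_4-\zeta)(1+t_4-\zeta^2) = (1+t_4)^2 + (1+t_4) + 1 = t_4^2 + 3t_4 + 3.
```

The quadratic factor $t_4^2 + 3t_4 + 3$ is an Eisenstein polynomial at 3, hence irreducible over $\mathbb{Q}_3$; it only splits into linear factors once $\zeta$ is adjoined, which is why the three components appear after that adjunction.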


§10. The universal “Galois” deformation ring. We mentioned that
for absolutely irreducible representations $\bar\rho$, there is a universal solution to
the problem of classifying deformations of $\bar\rho$. Explicitly,


Proposition. If N is a positive integer and

    $\bar\rho : G_{K,S} \to GL_N(k)$

is absolutely irreducible, there is a “universal coefficient-ring” $R = R(\bar\rho)$
with residue field k, and a “universal” deformation,

    $\rho^{univ} : G_{K,S} \to GL_N(R)$,

of $\bar\rho$ to R; it is universal in the sense that given any coefficient-ring A with
residue field k, and deformation

    $\rho : G_{K,S} \to GL_N(A)$

of $\bar\rho$ to A, there is one and only one homomorphism $h : R \to A$ inducing
the identity isomorphism on residue fields for which the composition of the
universal deformation $\rho^{univ}$ with the homomorphism $GL_N(R) \to GL_N(A)$
coming from h is equal to the deformation $\rho$. In other terms, the functor

    $D_{\bar\rho} : \{\text{Coefficient-rings with residue field } k\} \to \text{Sets}$

is representable by R, i.e.,

    $D_{\bar\rho}(A) = \mathrm{Hom}_{W(k)\text{-alg}}(R, A)$,

where W(k) is the ring of Witt vectors of k.


Easy but important exercise. If you have never worked with these concepts
before, it is very helpful to give a direct proof of this proposition for N = 1
and to give an explicit description of the ring $R(\bar\rho)$ and the universal
representation

    $\rho^{univ} : G_{K,S} \to R(\bar\rho)^*$

in the case when $\bar\rho$ is of degree 1 (using Class Field Theory). But the word
“explicit” in the previous sentence should be taken with a grain of salt,
because (if S contains all places of characteristic p) the determination of
the Krull dimension of $R(\bar\rho)$ is equivalent to the determination of the truth
or falsity of the Leopoldt Conjecture for p and the number field K. For all
this spelled out, see [M 1].

For the proof of this proposition for all N, the reader may consult [M 1], 
[G], or [D-D-T]. Also, a very detailed discussion of all this is forthcoming 
in [D-W]. Prior to the work we have just cited, there had already been
numerous studies of the local deformation theory, and also of the global 
variations of representations of finitely generated groups and algebras: see 
Procesi’s [P 1] Chapter IV, Lemma 1.7, and his follow-up article [P 2]; see 
also the memoir of Lubotzky and Magid [L-M] and the other works cited 
in the bibliography by Doran (available by anonymous ftp) referred to in 
the introduction to this article. 

Let us simply list some approaches to the proof of this proposition: 


1. Via Schlessinger’s Criteria: Schlessinger, in [Sch], gives necessary
and sufficient criteria for any covariant functor

    $D : \{\text{Coefficient-rings with residue field } k\} \to \text{Sets}$

to be representable, i.e., for there to exist a coefficient ring $R = R_D$ (not
necessarily artinian) and a “universal element” $\xi = \xi_D$ in D(R) satisfying
the “universal property” that, for any coefficient ring A and element
$a \in D(A)$, there is one and only one ring homomorphism $R \to A$ which is
the identity on residue fields and which brings the “universal element” $\xi$
to a.

See §18 below for a “review” of Schlessinger’s Criteria. See [M 1] for a
proof that Schlessinger’s criteria are met within the context of the
proposition above. The main “nonformal” ingredients needed to check this are,
firstly, Schur’s lemma (which is available to us because $\bar\rho$ is absolutely
irreducible) and secondly (to ensure noetherian-ness of R) that the set of
deformations of $\bar\rho$ to the coefficient-ring $k[\varepsilon]$ (where $\varepsilon$ is nontrivial and has
square zero) is finite. This finiteness condition holds in our situation as
given by the Corollary in §21.
2. A construction due to Faltings. If you wish to see a description
of the universal ring in terms of generators and relations (a description
which uses a “far-from-minimal” number of generators and relations, but
which has the virtue of being explicitly given in terms of the data) there is
a construction of R, and hence also a proof of representability of $D_{\bar\rho}$, due
to Faltings which does exactly that. For an account of this construction,
see for example pp. 56, 57 of [D-D-T]; also, the forthcoming [D-W].


3. A construction due to Lenstra and de Smit. For this, see their 
article [L-de-S] in this volume. 

4. Via Universal Characters. Another attitude towards the statement
of the proposition above is that it guarantees the existence of a “universal
character function” (together, of course, with a “universal ring R” acting
as value ring for this character function). Conversely, Rouquier and Nyssen
approach the construction of universal deformation rings by dealing directly
with pseudo-characters using the results of [Rouq], [Ny] described in §7
above. One shows that the property of being a character function has
a universal solution, thereby giving another construction of the universal
deformation ring.


§11. An alternative description of the deformation problem for
group representations (in a slightly more general context). Let $\Pi$
be a profinite group which satisfies the p-finiteness condition of §1. Let
k be a finite field of characteristic p. Let $\bar V$ be a finite-dimensional
k-representation space for $\Pi$ (and we assume that the action of $\Pi$ on $\bar V$ is
continuous). If B is a coefficient-ring with residue field k, by a deformation
of $\bar V$ to B let us mean a couple $(V, \alpha)$ where V is a free B-module
(of finite rank) with continuous $\Pi$-action and $\alpha : V \otimes_B k \cong \bar V$ is an
isomorphism of $\Pi$-representation spaces. By $D_{\bar V}(B)$ let us mean the set of
isomorphism classes of deformations of $\bar V$ to B; view $D_{\bar V}$ as a covariant
functor from the category of coefficient-rings with residue field k to the
category of sets,

    $D_{\bar V} : \{\text{Coefficient-rings with residue field } k\} \to \text{Sets}$.

By fixing a k-basis of $\bar V$ one may identify the automorphism group
$\mathrm{Aut}_k(\bar V)$ with $GL_N(k)$ where $N = \dim_k(\bar V)$ and the $\Pi$-action on $\bar V$ then
gives us a specific (continuous) residual representation $\bar\rho : \Pi \to GL_N(k)$.
One then sees directly from the definitions that there is an isomorphism of
functors, $D_{\bar V} \cong D_{\bar\rho}$. The relative problem can also be phrased this way: If
A is a coefficient-ring with residue field k, and V is a fixed free A-module
of rank N with A-linear continuous $\Pi$-action, and if $\rho : \Pi \to GL_N(A)$ is
the continuous homomorphism obtained from V by choosing an A-basis,
then letting

    $D_V : \{A\text{-augmented coefficient-rings with residue field } k\} \to \text{Sets}$

denote the functor which associates to an A-augmented coefficient-ring B
with residue field k the set of isomorphism classes of pairs $(V_B, \alpha)$ where
$V_B$ is a free B-module of rank N endowed with a B-linear continuous
$\Pi$-action, and $\alpha : V_B \otimes_B A \cong V$ is an isomorphism of $A[[\Pi]]$-modules, we have
a natural isomorphism of functors $D_\rho \cong D_V$.

Now let us return to the absolute deformation problem. Let $\bar V$ be a
finite-dimensional k-representation space for $\Pi$ (the action of $\Pi$ being assumed
continuous) such that the natural mapping

    $k \to \mathrm{End}_{\Pi}(\bar V)$


is an isomorphism. This would be the case, by Schur’s Lemma, if 6 were ab- 
solutely irreducible; cf. the Corollary of §4. But there are other important 
examples of representations 9 which satisfy the above condition without 
being absolutely irreducible. Specifically, let

\[
\bar\rho : \Pi \longrightarrow \mathrm{GL}_2(k)
\]

be a representation equivalent to a representation of the form

\[
\bar\rho(g) = \begin{pmatrix} \chi_1(g) & * \\ 0 & \chi_2(g) \end{pmatrix}
\]

which is not semisimple (equivalently: such that the image of $\bar\rho$ is of order
divisible by $p$), and such that one of the two characters $\chi_1$ or $\chi_2$ is nontrivial.
Examples of such representations may be found among the residual
representations attached to elliptic curves with ordinary reduction over $p$-adic
number fields. Then the $\Pi$-representation space $\bar V$ attached to $\bar\rho$ is not
absolutely irreducible, and yet does satisfy the condition displayed above.

The representability proposition of the previous section is valid in this
context, that is to say,


Proposition. $D_{\bar V}$ is representable; i.e., there is a coefficient-ring $R$ with
residue field $k$, and a finite free $R$-module $V_R$ endowed with a continuous
$\Pi$-action which is a deformation of $\bar V$ to $R$ which is universal in the sense
that any deformation $V$ of $\bar V$ to any coefficient-ring $A$ with residue field
$k$ comes from $V_R$ by tensor-product via a unique homomorphism $R \to A$
(which induces the identity on residue fields):

\[
V \cong V_R \otimes_R A.
\]


§12. Representations with coefficient-rings which are $\Lambda$-algebras.
Fix a coefficient-ring $\Lambda$ with residue field $k$. For a given profinite group $\Pi$
and finite-dimensional $k$-vector space $\bar V$ with continuous $\Pi$-action, with $\Pi$
and $\bar V$ satisfying the conditions formulated in §11 above, let us ask for
deformations of the representation $\bar V$ to coefficient-rings $A$ which are $\Lambda$-algebras
(where the structural algebra homomorphism $\Lambda \to A$ is a coefficient-ring
homomorphism). Let $D_{\bar V,\Lambda}$ denote the “restriction” of the functor $D_{\bar V}$ to
the category of such coefficient-ring $\Lambda$-algebras, i.e., the functor

\[
D_{\bar V,\Lambda} : \{\text{coefficient-ring } \Lambda\text{-algebras with residue field } k\} \longrightarrow \text{Sets}
\]

associates to the $\Lambda$-algebra $A$ the set $D_{\bar V}(A)$ of isomorphism classes of
deformations of $\bar V$ to $A$.

Letting $R$ denote the universal deformation ring of the $\Pi$-representation
$\bar V$ (whose existence is guaranteed in the proposition of §11) then:


Proposition. The functor $D_{\bar V,\Lambda}$ is representable by $R \hat\otimes_{W(k)} \Lambda$, where $\hat\otimes$
means “completed tensor product”.


Proof. Before we engage in the proof proper, let us take a minute to
review the notion of “completed tensor product.” The reason for its
involvement in the above proposition is that the (standard) tensor product
$R_1 \otimes_{W(k)} R_2$ of two coefficient-rings $R_1$ and $R_2$ (over $W(k)$) is not
necessarily a coefficient-ring: it need not be complete. The simple remedy is to
complete $R_1 \otimes_{W(k)} R_2$ with respect to the ideal

\[
\mathfrak{m} := \ker\bigl(R_1 \otimes_{W(k)} R_2 \to k\bigr);
\]

one sees easily that $\mathfrak{m} = \mathfrak{m}_1 \otimes_{W(k)} R_2 + R_1 \otimes_{W(k)} \mathfrak{m}_2$ where $\mathfrak{m}_i \subset R_i$
($i = 1,2$) are the maximal ideals. The completion $R_1 \hat\otimes_{W(k)} R_2$ has the
following two descriptions

\[
R_1 \hat\otimes_{W(k)} R_2
= \varprojlim_{\nu}\, (R_1 \otimes_{W(k)} R_2)/\mathfrak{m}^{\nu}
= \varprojlim_{\nu}\, (R_1/\mathfrak{m}_1^{\nu}) \otimes_{W(k)} (R_2/\mathfrak{m}_2^{\nu}),
\]

and if $\hat{\mathfrak{m}} \subset R_1 \hat\otimes_{W(k)} R_2$ denotes the closure of $\mathfrak{m}$, one sees that $R_1 \hat\otimes_{W(k)} R_2$
is again a complete noetherian local ring with maximal ideal $\hat{\mathfrak{m}}$ and with
residue field $k$. In particular, the category of coefficient rings (with residue
field $k$) is closed under completed tensor product.

Concretely, if $R_1$ and $R_2$ are the quotients of the power series rings
$W(k)[[x_1,\dots,x_s]]$ and $W(k)[[y_1,\dots,y_t]]$ by the closed ideals generated by
the power series

\[
f_1,\dots,f_u \in W(k)[[x_1,\dots,x_s]] \quad\text{and}\quad g_1,\dots,g_v \in W(k)[[y_1,\dots,y_t]]
\]

respectively, then $R_1 \hat\otimes_{W(k)} R_2$ is isomorphic to the quotient of the power
series ring $W(k)[[x_1,\dots,x_s,y_1,\dots,y_t]]$ by the closed ideal generated by
the $u + v$ power series $f_1,\dots,f_u, g_1,\dots,g_v$.
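
The concrete description above can be tested on the smallest case. The following is only a sketch, under the assumption $k = \mathbf{F}_p$ (so that $W(k) = \mathbf{Z}_p$) and with one variable and no relations on each side; the series $h$ is a hypothetical element chosen for illustration.

```latex
% Illustrative assumptions: k = F_p, W(k) = Z_p, R_1 = Z_p[[x]], R_2 = Z_p[[y]].
\[
  \mathbf{Z}_p[[x]] \,\hat{\otimes}_{\mathbf{Z}_p}\, \mathbf{Z}_p[[y]]
  \;\cong\; \mathbf{Z}_p[[x,y]] .
\]
% The image of the uncompleted tensor product in Z_p[[x,y]] misses, e.g.,
\[
  h \;=\; \sum_{n \ge 0} x^n y^n ,
\]
% because a finite sum \sum_i f_i(x) g_i(y) has a coefficient matrix of finite
% rank, while the coefficient matrix of h is the (infinite) identity matrix.
% The partial sums \sum_{n < N} x^n \otimes y^n do converge to h in the
% \mathfrak{m}-adic topology, which is why completing repairs the defect.
```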

The proof of the proposition comes from reviewing the definitions
involved: the ring $R \hat\otimes_{W(k)} \Lambda$ is a “coefficient-ring and a $\Lambda$-algebra” (a
“coefficient-$\Lambda$-algebra” for short) and carries a deformation of $\bar V$ induced from
the universal deformation of $\bar V$ to $R$. Moreover, any deformation of $\bar V$ to
a coefficient-ring $\Lambda$-algebra $A$ is induced from the universal deformation
to $R$ via a unique homomorphism $R \to A$ which extends to a unique
$\Lambda$-algebra homomorphism $R \hat\otimes_{W(k)} \Lambda \to A$, establishing the required universal
property for $R \hat\otimes_{W(k)} \Lambda$.


From now on in these notes, we shall be fixing a coefficient-ring $\Lambda$ with
residue field $k$ of characteristic $p$, and we will work with $\Lambda$ as base ring.
That is, we deal with the category whose objects are coefficient-$\Lambda$-algebras
$A$ and morphisms are homomorphisms of coefficient-$\Lambda$-algebras: we will
study representations with these $\Lambda$-algebras as coefficient-rings. The
“default” base ring $\Lambda$ is, of course, just $W(k)$, as discussed in §2.


§13. Is there a relationship between the (formal, say) deformation
space of a variety V defined over a number field K, and the
deformation space of the various Galois representations occurring
in the étale cohomology of V?

No. These seem to be quite different animals.


Part Two


CHAPTER IV. FUNCTORS AND REPRESENTABILITY


§14. Fiber products and representability. Fix $\Lambda$ a coefficient-ring
with residue field $k$ of characteristic $p$, and let $A$ be a coefficient-$\Lambda$-algebra.
Denote by $\hat{\mathcal{C}}_\Lambda(A)$ the category whose
objects are coefficient-$\Lambda$-algebras which are endowed with a
coefficient-$\Lambda$-algebra homomorphism to $A$. Let $\mathcal{C}_\Lambda(A)$ denote the full subcategory of
$\hat{\mathcal{C}}_\Lambda(A)$ whose objects are artinian coefficient-$\Lambda$-algebras (again endowed
with an $A$-augmentation, i.e., a coefficient-$\Lambda$-algebra homomorphism to
$A$). If $A$ is the residue field $k$, let us drop it from the notation, i.e., $\hat{\mathcal{C}}_\Lambda(k)$
and $\mathcal{C}_\Lambda(k)$ will be denoted $\hat{\mathcal{C}}_\Lambda$ and $\mathcal{C}_\Lambda$, respectively. The reason for the
hat notation is that any coefficient-ring $A$ may be written as the projective
limit of artinian ones:

\[
A = \varprojlim_{n \to \infty} A/\mathfrak{m}_A^{n}.
\]

If we are out to prove that a given functor, call it $D$ (say on the larger
category $\hat{\mathcal{C}}_\Lambda$) is representable (as we shall be!), the representing
coefficient-$\Lambda$-algebra, call it $R$, is completely determined by the restriction of the
functor to the smaller category $\mathcal{C}_\Lambda$. This is true because

\[
\mathrm{Hom}(R, A) = \varprojlim_{n}\, \mathrm{Hom}(R, A/\mathfrak{m}_A^{n})
\]

as sets. It is convenient to do most of our work directly with the smaller
category $\mathcal{C}_\Lambda$ if our functors $D$ satisfy the property that

\[
D(A) = \varprojlim_{n \to \infty} D(A/\mathfrak{m}_A^{n})
\tag{1}
\]

for all coefficient $\Lambda$-algebras $A$. Call such a functor continuous. A
continuous functor on $\hat{\mathcal{C}}_\Lambda$ is determined by its restriction to $\mathcal{C}_\Lambda$.

Schlessinger calls functors on $\mathcal{C}_\Lambda$ which are represented by objects of
the larger category $\hat{\mathcal{C}}_\Lambda$ pro-representable (as is only fitting, since they are
represented by projective limits of objects of the category on which they
are defined) but we will often drop the prefix “pro-”.

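Given a diagram of sets,

\[
\begin{array}{ccccc}
A & & & & B \\
 & \searrow{\scriptstyle\alpha} & & {\scriptstyle\beta}\swarrow & \\
 & & C & &
\end{array}
\tag{2}
\]

the “fiber-product” $A \times_C B$ is the subset of the product $A \times B$ consisting of
all couples $(a, b)$ such that $\alpha(a) = \beta(b)$. The fiber-product $A \times_C B$ “comes
along” with projections to $A$ and to $B$, and fits into a diamond

\[
\begin{array}{ccccc}
 & & A \times_C B & & \\
 & \swarrow & & \searrow & \\
A & & & & B \\
 & \searrow{\scriptstyle\alpha} & & {\scriptstyle\beta}\swarrow & \\
 & & C & &
\end{array}
\tag{3}
\]

It is useful to have the accompanying notion of Cartesian diagram (of
which (3) is the prototype). One says that a diagram of sets

\[
\begin{array}{ccccc}
 & & E & & \\
 & \swarrow & & \searrow & \\
A & & & & B \\
 & \searrow{\scriptstyle\alpha} & & {\scriptstyle\beta}\swarrow & \\
 & & C & &
\end{array}
\tag{4}
\]

is cartesian if the pair of mappings $E \to A$ and $E \to B$ identify the set
$E$ with the fiber-product $A \times_C B$; i.e., if the diagrams (3) and (4) are
isomorphic (the isomorphism being the identity on similarly labeled sets
and mappings).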

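The notions of fiber-product and cartesian diagram are “categorical” in
the sense that if, instead of starting with the diagram of sets (2), we start
with a diagram (5) of set-valued functors on any category $\mathbf{C}$,

\[
\begin{array}{ccccc}
\mathcal{A} & & & & \mathcal{B} \\
 & \searrow{\scriptstyle\alpha} & & {\scriptstyle\beta}\swarrow & \\
 & & \mathcal{C} & &
\end{array}
\tag{5}
\]

then the same definitions allow us to talk of the fiber-product $\mathcal{A} \times_{\mathcal{C}} \mathcal{B}$ whose
value on any object $X$ of $\mathbf{C}$ is given by the fiber-product of the values of $\mathcal{A}$
and $\mathcal{B}$ on $X$, i.e.,

\[
(\mathcal{A} \times_{\mathcal{C}} \mathcal{B})(X) = \mathcal{A}(X) \times_{\mathcal{C}(X)} \mathcal{B}(X),
\tag{6}
\]

giving us a diagram of functors

\[
\begin{array}{ccccc}
 & & \mathcal{A} \times_{\mathcal{C}} \mathcal{B} & & \\
 & \swarrow & & \searrow & \\
\mathcal{A} & & & & \mathcal{B} \\
 & \searrow{\scriptstyle\alpha} & & {\scriptstyle\beta}\swarrow & \\
 & & \mathcal{C} & &
\end{array}
\tag{7}
\]

and allowing us to say, in analogy with our discussion for sets, what it
means for a diagram of functors

\[
\begin{array}{ccccc}
 & & \mathcal{E} & & \\
 & \swarrow & & \searrow & \\
\mathcal{A} & & & & \mathcal{B} \\
 & \searrow{\scriptstyle\alpha} & & {\scriptstyle\beta}\swarrow & \\
 & & \mathcal{C} & &
\end{array}
\tag{8}
\]

to be cartesian.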

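Even if $\mathcal{A}, \mathcal{B}, \mathcal{C}$ are representable functors on $\mathbf{C}$ (with representing
objects $A, B, C$, so that $\mathcal{A}(X) = \mathrm{Hom}(X, A)$ and similarly for $\mathcal{B}$ and $\mathcal{C}$),
the fiber-product functor $\mathcal{A} \times_{\mathcal{C}} \mathcal{B}$ may or may not be
representable in $\mathbf{C}$; but if it is representable, its representing object, called
$A \times_C B$, and coming along with a pair of morphisms $A \times_C B \to A$, $A \times_C B \to B$,
is well-defined up to unique isomorphism in $\mathbf{C}$. If this is the case,
colloquially one says that the fiber product $A \times_C B$ “exists” in $\mathbf{C}$, and we
get a (cartesian) diagram in $\mathbf{C}$: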

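\[
\begin{array}{ccccc}
 & & A \times_C B & & \\
 & \swarrow & & \searrow & \\
A & & & & B \\
 & \searrow{\scriptstyle\alpha} & & {\scriptstyle\beta}\swarrow & \\
 & & C & &
\end{array}
\tag{9}
\]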


The prototypical example. If you are not familiar with the notion of 
fiber-product, it might be helpful to note that fiber-products (as defined 
for any category above) do indeed exist in the category of sets, and these 
fiber-products are given by the construction given in diagram (3) above. 
Fiber-products also exist in the category of commutative rings and are 
given by the analogous construction. 
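
For readers meeting ring fiber-products for the first time, a small worked instance may help. It is only an illustration: the rings $k[[x]]$ and $k[[y]]$ (with $k$ any field) and the map $\theta$ are choices made here, not taken from the text.

```latex
% Illustrative choice: A = k[[x]], B = k[[y]], C = k, with
% \alpha(f) = f(0) and \beta(g) = g(0).
\[
  A \times_C B \;=\; \{\, (f,g) \in k[[x]] \times k[[y]] \;:\; f(0) = g(0) \,\}.
\]
% The homomorphism
\[
  \theta : k[[x,y]] \longrightarrow k[[x]] \times k[[y]], \qquad
  \theta(h) = \bigl( h(x,0),\, h(0,y) \bigr)
\]
% has image exactly A \times_C B (given (f,g), take h = f(x) + g(y) - f(0))
% and kernel (x) \cap (y) = (xy), so that
\[
  k[[x]] \times_k k[[y]] \;\cong\; k[[x,y]]/(xy).
\]
```

Geometrically, this is the ring of functions on two formal lines glued at their base points.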

When fiber-products “exist,” we may use the bijection (6), turning it
around a bit, to provide for us a powerful necessary condition for
representability. Specifically, suppose that we have a covariant set-valued functor
$F$ on our category $\mathbf{C}$. Applying $F$ to diagram (9) gives a diagram of sets

\[
\begin{array}{ccccc}
 & & F(A \times_C B) & & \\
 & \swarrow & & \searrow & \\
F(A) & & & & F(B) \\
 & \searrow & & \swarrow & \\
 & & F(C) & &
\end{array}
\tag{10}
\]

which, if $F$ were representable (say by an object $X$ of $\mathbf{C}$) would be
cartesian by (6), i.e., the mapping

\[
h : F(A \times_C B) \longrightarrow F(A) \times_{F(C)} F(B)
\tag{11}
\]

would be a bijection.
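
The reason a representable functor must satisfy this condition is just the universal property of the fiber-product, written out. In the sketch below $X$ denotes the representing object, so that $F(-) = \mathrm{Hom}(X,-)$; the names $p_A$, $p_B$ for the two projections are introduced here for convenience.

```latex
% Assumption: F is represented by X, i.e. F(Y) = Hom(X, Y) for all objects Y.
% p_A, p_B denote the projections from A \times_C B to A and to B.
\[
  h : \mathrm{Hom}(X, A \times_C B) \longrightarrow
      \mathrm{Hom}(X,A) \times_{\mathrm{Hom}(X,C)} \mathrm{Hom}(X,B),
  \qquad
  h(\varphi) = (p_A \circ \varphi,\; p_B \circ \varphi).
\]
% Injective: \varphi is determined by its two components.
% Surjective: a pair (u,v) with \alpha \circ u = \beta \circ v is exactly the
% data that the universal property of A \times_C B turns into a unique
% \varphi with p_A \circ \varphi = u and p_B \circ \varphi = v.
```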

The “earmark” of representability, then, for a functor $F$ is the property
(which I shall refer to as the Mayer-Vietoris property) that the
morphism $h$ of (11) above is a bijection for all cartesian diagrams (9) of the
category $\mathbf{C}$. This is germane to our situation for we have the easy


Lemma. Let $A$ be a coefficient-$\Lambda$-algebra. Fiber products “exist” in the
categories $\mathcal{C}_\Lambda(A)$.


Specifically, if

\[
A \xrightarrow{\ \alpha\ } C \xleftarrow{\ \beta\ } B
\]

is a diagram of artinian $\Lambda$-algebra coefficient-rings with $A$-augmentation,
then the subring

\[
A \times_C B \subset A \times B
\]

consisting of elements $(a,b)$ such that $\alpha(a) = \beta(b)$ is again a
coefficient-$\Lambda$-algebra which is artinian. It inherits an $A$-augmentation, and is the
categorical fiber-product.

I am thankful to Brian Conrad for explaining to me that the larger
category $\hat{\mathcal{C}}_\Lambda(A)$ is not closed under fiber products, the problem being that
the fiber product of objects of $\hat{\mathcal{C}}_\Lambda(A)$ need not be noetherian. He
suggested the following example. Let $k$ be a field, $A = k[[X,Y]]$, $B = k$,
and $C = k[[X]]$, i.e., $A$ and $C$ are the power series rings in the indicated
variables over $k$. Mapping the $k$-algebra $k[[X,Y]]$ to $k[[X]]$ by sending $Y$
to $0$, and mapping the $k$-algebra $k$ to $k[[X]]$ in the unique manner, we get
a diagram

\[
A = k[[X,Y]] \longrightarrow C = k[[X]] \longleftarrow k = B
\]

and the fiber-product $A \times_C B$ is given by the sub-ring $k \oplus Y \cdot k[[X,Y]]$ in
$k[[X,Y]]$. The maximal ideal of $A \times_C B$ is $Y \cdot k[[X,Y]]$, and the Zariski
tangent space of $A \times_C B$ may be identified with the $k$-vector space $k[[X]]$,
which is infinite dimensional; i.e., $A \times_C B$ is not noetherian. In the special
case where both $A \to C$ and $B \to C$ are surjective morphisms in the
category $\hat{\mathcal{C}}_\Lambda$ then the ring $A \times_C B$ is noetherian (see ex. 3.2 of [Mat]) and
is again in $\hat{\mathcal{C}}_\Lambda$.
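
The identification used in this example can be checked by hand. The following computation is a sketch, writing $S$ for the fiber-product and $\mathfrak{m}$ for its maximal ideal (names introduced here); it works with $\mathfrak{m}/\mathfrak{m}^2$, whose $k$-dual is the tangent space.

```latex
% Notation introduced for this check: S = A \times_C B, \mathfrak{m} = its maximal ideal.
\[
  S = k \oplus Y\,k[[X,Y]], \qquad
  \mathfrak{m} = Y\,k[[X,Y]], \qquad
  \mathfrak{m}^2 = Y^2\,k[[X,Y]] .
\]
% Dividing by Y identifies the quotient with a power series ring in X:
\[
  \mathfrak{m}/\mathfrak{m}^2
  \;=\; Y\,k[[X,Y]] \,/\, Y^2\,k[[X,Y]]
  \;\cong\; k[[X,Y]]/(Y)
  \;\cong\; k[[X]] .
\]
% Since k[[X]] is infinite-dimensional over k, \mathfrak{m} cannot be finitely
% generated (a finite generating set would span \mathfrak{m}/\mathfrak{m}^2
% over k = S/\mathfrak{m}), so S is not noetherian.
```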


§15. A functor’s-eye view of the Zariski tangent space (the
“absolute” case). Fix $\Lambda$ a coefficient ring and $R$ a coefficient ring $\Lambda$-algebra.
Denoting their maximal ideals $\mathfrak{m}_\Lambda \subset \Lambda$ and $\mathfrak{m}_R \subset R$, let us recall the
definition of $t_R^* = t_{R/\Lambda}^*$, the “Zariski cotangent space” of the $\Lambda$-algebra $R$,

\[
t_R^* = \mathfrak{m}_R / \bigl(\mathfrak{m}_R^2 + (\text{image of } \mathfrak{m}_\Lambda) \cdot R\bigr).
\]

The intuition behind this definition is that if one thinks of $R$ as being
“functions on some base-pointed space,” then $\mathfrak{m}_R$ may be thought of as
those functions vanishing at the base point, and $t_R^*$ is the quotient of $\mathfrak{m}_R$
by the appropriate ideal (of “higher order terms” of these functions) so as
to isolate the “linear parts” of these functions. “Linear” is a key word here,
for $t_R^*$ is naturally endowed with the structure of an $R/\mathfrak{m}_R = \Lambda/\mathfrak{m}_\Lambda$ module,
i.e., $t_R^*$ is a vector space over $k$. As is only fitting one defines the Zariski
tangent ($k$-vector) space to $R$ to be the dual $k$-vector space,

\[
t_R = \mathrm{Hom}_k\bigl(\mathfrak{m}_R/(\mathfrak{m}_R^2 + \mathfrak{m}_\Lambda \cdot R),\, k\bigr).
\]

Since $R$ is noetherian, $t_R^*$ is a finite-dimensional $k$-vector space and so $t_R$
is naturally the $k$-dual of $t_R^*$, thereby justifying the notation.

It will be important for us to give a definition of the $k$-vector space $t_R$
using only the covariant functor, call it $D_R$, which is represented by $R$, i.e.,
the functor

\[
B \mapsto D_R(B) = \mathrm{Hom}_{\Lambda\text{-alg}}(R, B)
\]

for $B$ in $\mathcal{C}_\Lambda$. The key idea is to invoke the $\Lambda$-algebra $k[\epsilon]$ defined by the
relation $\epsilon^2 = 0$. The algebra $k[\epsilon]$ is a vector space of dimension two over $k$,

\[
k[\epsilon] = k \oplus \epsilon \cdot k,
\tag{1}
\]

the first subspace in the above direct sum decomposition being generated
by the unit element of the algebra $k[\epsilon]$ and the second subspace being the
maximal ideal (which has, of course, square zero).


Proposition. There is a natural isomorphism of $k$-vector spaces

\[
\mathrm{Hom}_{k\text{-v.sp}}\bigl(\mathfrak{m}_R/(\mathfrak{m}_R^2 + \mathfrak{m}_\Lambda \cdot R),\, k\bigr) \cong \mathrm{Hom}_{\Lambda\text{-alg}}(R, k[\epsilon]).
\tag{2}
\]

(If you have never seen this before, it is more instructive to try to do this
as an exercise, rather than to read the proof below.)


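Proof. Since the maximal ideal of $k[\epsilon]$ has square zero, the natural mapping

\[
\mathrm{Hom}_{\Lambda\text{-alg}}\bigl(R/(\mathfrak{m}_R^2 + \mathfrak{m}_\Lambda \cdot R),\, k[\epsilon]\bigr) \longrightarrow \mathrm{Hom}_{\Lambda\text{-alg}}(R, k[\epsilon])
\tag{3}
\]

is a bijection. Now the $k$-algebra $R/(\mathfrak{m}_R^2 + \mathfrak{m}_\Lambda \cdot R)$ has a natural direct
sum decomposition

\[
R/(\mathfrak{m}_R^2 + \mathfrak{m}_\Lambda \cdot R) = k \oplus \mathfrak{m}_R/(\mathfrak{m}_R^2 + \mathfrak{m}_\Lambda \cdot R)
\tag{4}
\]

the first subspace in the above direct sum decomposition being generated
by the unit element and the second subspace being the maximal ideal.
Clearly then, any $\Lambda$-algebra homomorphism from $R/(\mathfrak{m}_R^2 + \mathfrak{m}_\Lambda \cdot R)$ to
$k[\epsilon]$ must respect the direct sum decompositions (1) and (4) and (since the
homomorphism is constrained to be the identity on the first summand, but
may be any $k$-vector space homomorphism on the second) we have

\[
\mathrm{Hom}_{\Lambda\text{-alg}}(R, k[\epsilon]) \cong \mathrm{Hom}_{k\text{-v.sp}}\bigl(\mathfrak{m}_R/(\mathfrak{m}_R^2 + \mathfrak{m}_\Lambda \cdot R),\, \epsilon \cdot k\bigr).
\]

Identifying the $k$-vector space $\epsilon \cdot k$ with $k$, and combining the above
isomorphism with (3) yields (2).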
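
As a quick test case for the proposition, take a power series ring. The sketch below assumes $R = \Lambda[[X_1,\dots,X_n]]$, a choice made here for illustration; $\bar X_i$ denotes the class of $X_i$ in the cotangent space.

```latex
% Illustrative choice: R = \Lambda[[X_1, ..., X_n]].
% Left side of (2): the classes of X_1, ..., X_n form a k-basis of the
% cotangent space,
\[
  \mathfrak{m}_R/(\mathfrak{m}_R^2 + \mathfrak{m}_\Lambda R)
  \;=\; k\,\bar X_1 \oplus \cdots \oplus k\,\bar X_n ,
\]
% so its k-dual is k^n.
% Right side of (2): a \Lambda-algebra homomorphism \varphi : R \to k[\epsilon]
% is determined by the images of the X_i, which must lie in the maximal ideal:
\[
  \varphi(X_i) = a_i\,\epsilon, \qquad (a_1,\dots,a_n) \in k^n \ \text{arbitrary}.
\]
% Under (2), \varphi corresponds to the linear form \bar X_i \mapsto a_i.
```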

This proposition does give us a functorial interpretation of the relative Zariski
tangent space, namely:

\[
t_R = D_R(k[\epsilon]),
\tag{5}
\]

and allows us, jumping the gun a bit!, to make the following definition.

Definition. Let $D : \mathcal{C}_\Lambda \to \text{Sets}$ be any covariant functor such that $D(k)$
consists of a single element. Then the “Zariski tangent ($k$-vector)
space” of $D$, denoted $t_D$, is the set $D(k[\epsilon])$.

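In this generality, we cannot yet guarantee a natural $k$-vector space
structure on the set $t_D$ (and this is what I meant by saying that we have
“jumped the gun”). Nevertheless we can already see the structure of “scalar
multiplication by $k$” on $t_D$. Namely, let us notice that the multiplicative
monoid $\mathrm{End}_{\Lambda\text{-alg}}(k[\epsilon])$ acts on the set $t_D = D(k[\epsilon])$ by functoriality of $D$
and we have a natural homomorphism of multiplicative monoids

\[
k \longrightarrow \mathrm{End}_{\Lambda\text{-alg}}(k[\epsilon]), \qquad
a \mapsto \alpha_a, \quad\text{where } \alpha_a(x \oplus y \cdot \epsilon) = x \oplus a \cdot y \cdot \epsilon.
\tag{6}
\]

This multiplicative action of $k$ will be the scalar multiplication in the
eventual $k$-vector space structure of $t_D$ in the special case where this vector
space structure can be defined. Let us also point to the structure which
will give rise to the law of vector-addition. Namely, define the $k$-algebra
homomorphism which we will simply label “+”:

\[
\begin{array}{rcl}
k[\epsilon] \times_k k[\epsilon] & \xrightarrow{\ +\ } & k[\epsilon] \\
(x \oplus y_1 \cdot \epsilon,\; x \oplus y_2 \cdot \epsilon) & \mapsto & x \oplus (y_1 + y_2) \cdot \epsilon
\end{array}
\]

We need a further hypothesis concerning our functor $D$. We call it $(T_k)$,
for “Tangent Space Hypothesis.”

$(T_k)$ The mapping

\[
h : D(k[\epsilon] \times_k k[\epsilon]) \longrightarrow D(k[\epsilon]) \times D(k[\epsilon])
\tag{7}
\]

is a bijection.

If $D$ satisfies $(T_k)$ we define vector addition on the tangent space of $D$
by the composition

\[
D(k[\epsilon]) \times D(k[\epsilon]) \xrightarrow{\ h^{-1}\ } D(k[\epsilon] \times_k k[\epsilon]) \xrightarrow{\ D(+)\ } D(k[\epsilon]),
\qquad\text{i.e.,}\qquad t_D \times t_D \longrightarrow t_D.
\tag{8}
\]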
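
When $D = D_R$ is representable, the composition (8) can be made completely explicit, and it reduces to adding $\epsilon$-coefficients. The following sketch uses notation introduced here: $\bar r$ is the image of $r \in R$ in $k$, and $d_1, d_2 : R \to k$ are the $\epsilon$-coefficients of two tangent vectors $\varphi_1, \varphi_2$.

```latex
% Notation introduced here: \bar r = image of r in k;
% \varphi_1, \varphi_2 \in D_R(k[\epsilon]) with \epsilon-coefficients d_1, d_2.
\[
  \varphi_i(r) = \bar r \oplus d_i(r)\,\epsilon \qquad (i = 1,2).
\]
% h^{-1}(\varphi_1,\varphi_2) is the homomorphism into the fiber product
\[
  \Phi : R \to k[\epsilon] \times_k k[\epsilon], \qquad
  \Phi(r) = \bigl( \bar r \oplus d_1(r)\,\epsilon,\; \bar r \oplus d_2(r)\,\epsilon \bigr),
\]
% and composing with "+" gives
\[
  (\varphi_1 + \varphi_2)(r) = \bar r \oplus \bigl( d_1(r) + d_2(r) \bigr)\,\epsilon .
\]
% Likewise the action (6) of a \in k gives
% (a \cdot \varphi_1)(r) = \bar r \oplus a\,d_1(r)\,\epsilon.
```

So in the representable case the operations are the expected pointwise ones on linear forms.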
§16. The Zariski tangent A-module (the “relative” case). Suppose,
now, that we are in the relative case. That is, we have fixed a coefficient
$\Lambda$-algebra $A$, and a covariant functor

\[
D : \hat{\mathcal{C}}_\Lambda(A) \longrightarrow \text{Sets}
\]

such that the value of the functor $D$ on the $A$-augmented coefficient-$\Lambda$-algebra
$A$ itself, $D(A)$, is a single point. Let $A[\epsilon]$ denote the coefficient-$\Lambda$-algebra
$A[T]/(T^2)$ where $\epsilon = T \bmod (T^2)$. Then $A[\epsilon]$ is free of rank 2 as
an $A$-module, with $A$-basis $\{1, \epsilon\}$:

\[
A[\epsilon] := A \oplus \epsilon \cdot A
\tag{9}
\]

We view $A[\epsilon]$ as an $A$-augmented coefficient-$\Lambda$-algebra, where the
augmentation mapping is passage to the quotient by $(\epsilon)$; i.e., it is the projection
to the first summand in (9). Then, as in §15, the object $A[\epsilon]$ of $\hat{\mathcal{C}}_\Lambda(A)$ is “an
$A$-module-object” in $\hat{\mathcal{C}}_\Lambda(A)$, in the sense that $A[\epsilon]$ admits an “addition law”

\[
\begin{array}{rcl}
A[\epsilon] \times_A A[\epsilon] & \xrightarrow{\ +\ } & A[\epsilon] \\
(x \oplus y_1 \cdot \epsilon,\; x \oplus y_2 \cdot \epsilon) & \mapsto & x \oplus (y_1 + y_2) \cdot \epsilon
\end{array}
\tag{10}
\]

and it may be endowed with scalar multiplication by elements $a$ of $A$

\[
\begin{array}{rcl}
A[\epsilon] & \xrightarrow{\ a\ } & A[\epsilon] \\
(x \oplus y \cdot \epsilon) & \mapsto & (x \oplus a \cdot y \cdot \epsilon)
\end{array}
\tag{11}
\]

where these operations satisfy, formally, all the properties of “$A$-module
operations.” Note also that $A[\epsilon] \times_A A[\epsilon]$ is again a coefficient $\Lambda$-algebra
(the point being that it is still noetherian; see the discussion at the end
of §14).

Now suppose that $D$ satisfies the “Tangent $A$-module Hypothesis”:

$(T_A)$ The mapping $h : D(A[\epsilon] \times_A A[\epsilon]) \to D(A[\epsilon]) \times D(A[\epsilon])$
is a bijection.

Then we make the analogous

Definition. The Zariski tangent $A$-module, denoted $t_{D,A}$, or $t_D$ if $A$
is understood, is given, as a set, by

\[
t_{D,A} := D(A[\epsilon])
\]

and it inherits an $A$-module structure as follows:
Letting $D(+)$ denote the mapping induced from (10) above, the addition
law in $t_D$ is given by the composition

\[
D(A[\epsilon]) \times D(A[\epsilon]) \xrightarrow{\ h^{-1}\ } D(A[\epsilon] \times_A A[\epsilon]) \xrightarrow{\ D(+)\ } D(A[\epsilon]),
\]

while scalar multiplication by $a \in A$ is the mapping induced from (11)
above.

Since any coefficient-$\Lambda$-algebra has a natural “$k$-augmentation,” we may
think of the “absolute case,” treated in the previous section, as a particular
example of the “relative case” under discussion here, just by taking $A = k$.


§17. Continuous Kähler differentials and the (relative) Zariski
Tangent Space. Let $R$ be a coefficient-$\Lambda$-algebra. Let $\Omega_{R/\Lambda}$ denote the
$R$-module of relative (continuous) Kähler differentials of the $\Lambda$-algebra $R$;
continuity being understood, we will refer to this module simply as $\Omega_{R/\Lambda}$.
For a discussion of the notion
of Kähler differentials, and for a list of its basic properties see [Mat] and
[Hal], but note that we are working here in a slightly different category
than is dealt with in those references. Specifically, we will be dealing
exclusively with (topologically profinite) complete noetherian rings, and we
will demand that our Kähler differentials respect the appropriate
topologies. For a reference that does things in such a topological context, see [Gr]
(Chapter 0, §20). Intuitively, $\Omega_{R/\Lambda}$ is the $R$-module packaging the
“maximum amount of first-order infinitesimal information about the $\Lambda$-algebra
$R$.” Somewhat more formally, the $R$-module $\Omega_{R/\Lambda}$ comes along with a
(continuous) derivation

\[
d : R \longrightarrow \Omega_{R/\Lambda}
\tag{11}
\]

relative to $\Lambda$ (i.e., such that $d\Lambda = 0$) and it is universal for this structure,
i.e., it is the universal (topologically profinite) $R$-module equipped with
(continuous) derivation from $R$, relative to $\Lambda$. That is, for any topological
$R$-module $M$ we have a canonical isomorphism

\[
\mathrm{Hom}_{R\text{-mod}}(\Omega_{R/\Lambda}, M) \cong \mathrm{Der}_{\Lambda\text{-alg}}(R, M)
\tag{12}
\]

where $\mathrm{Der}_{\Lambda\text{-alg}}(R, M)$ is the $R$-module of continuous derivations from $R$ to
the $R$-module $M$.

Somewhat more concretely, it can be constructed as follows: Let $R \hat\otimes_\Lambda R$
be the completed tensor product of the coefficient-$\Lambda$-algebra $R$ with itself,
over $\Lambda$. Then $R \hat\otimes_\Lambda R$ is again a coefficient-$\Lambda$-algebra, and the multiplication
homomorphism

\[
\mu : R \hat\otimes_\Lambda R \longrightarrow R, \qquad x \otimes y \mapsto x \cdot y
\]

is a surjective homomorphism of coefficient-$\Lambda$-algebras. Let $I \subset R \hat\otimes_\Lambda R$
denote the kernel of $\mu$. Since $\mu$ is continuous, $I$ is a closed ideal in $R \hat\otimes_\Lambda R$.
We have a continuous homomorphism of $\Lambda$-modules $\delta : R \to I$ defined by
the equation $\delta(r) = r \otimes 1 - 1 \otimes r$ for $r \in R$. Since $R \hat\otimes_\Lambda R$ is noetherian, $I^2$
is a closed ideal. The (topologically profinite) $R \hat\otimes_\Lambda R$-module structure on
$I/I^2$ “factors through” a canonical $R$-module structure on $I/I^2$ (via the
topological identification $(R \hat\otimes_\Lambda R)/I = R$) and the mapping

\[
d : R \longrightarrow I/I^2
\tag{13}
\]

obtained from $\delta$ by projection $I \to I/I^2$ is easily seen to be a derivation
relative to $\Lambda$. One checks that the continuous derivation (13) is indeed
“universal” for continuous derivations of $R$ relative to $\Lambda$. Therefore we
may take $\Omega_{R/\Lambda} = I/I^2$, and (13) provides us with a construction of the
universal derivation (11).
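
The claim that (13) is a derivation comes down to one identity in $R \hat\otimes_\Lambda R$. A sketch of the verification, using only the definition of $\delta$:

```latex
% For r, s in R, with \delta(r) = r \otimes 1 - 1 \otimes r:
\[
  \delta(rs) \;=\; rs \otimes 1 - 1 \otimes rs
  \;=\; (r \otimes 1)\,\delta(s) + (1 \otimes s)\,\delta(r).
\]
% Modulo I^2 the elements s \otimes 1 and 1 \otimes s act in the same way on I
% (their difference \delta(s) lies in I), so in I/I^2:
\[
  d(rs) \;=\; r\,d(s) + s\,d(r).
\]
% Finally d(\lambda) = \lambda \otimes 1 - 1 \otimes \lambda = 0 for
% \lambda \in \Lambda, because the tensor product is taken over \Lambda.
```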
Exercises. 1) Let $P$ be the power series ring over $\Lambda$,

\[
P = \Lambda[[X_1,\dots,X_n]].
\]

Show that $\Omega_{P/\Lambda}$ is canonically isomorphic to the free $P$-module of rank $n$
generated by elements $dX_1,\dots,dX_n$ and show that the universal derivation
$d : P \to \Omega_{P/\Lambda}$ is the standard differential

\[
d : P \longrightarrow P \cdot dX_1 \oplus P \cdot dX_2 \oplus \cdots \oplus P \cdot dX_n
\]

on power series (with $d\lambda = 0$ for all $\lambda \in \Lambda$).

2) Let $R$ be given as the quotient ring of the power series ring $P$ in 1)
by the ideal $J \subset P$ generated by $m$ elements,

\[
R = \Lambda[[X_1,\dots,X_n]]/(f_1,\dots,f_m).
\]

Show that $\Omega_{R/\Lambda}$ may be presented as the quotient of the free $R$-module on
$n$ generators $dX_1,\dots,dX_n$ by the sub-$R$-module generated by the images
of $df_1,\dots,df_m$ under the projection

\[
P \cdot dX_1 \oplus P \cdot dX_2 \oplus \cdots \oplus P \cdot dX_n \longrightarrow R \cdot dX_1 \oplus R \cdot dX_2 \oplus \cdots \oplus R \cdot dX_n.
\]

In particular, $\Omega_{R/\Lambda}$ is an $R$-module of finite type.

3) For $R$ a coefficient-$\Lambda$-algebra, show that there is a natural identification
between the $k$-vector space $\Omega_{R/\Lambda} \otimes_R k$ and the cotangent space
of $R$ relative to $\Lambda$, i.e., with

\[
t^*_{R/\Lambda} := \mathfrak{m}_R/(\mathfrak{m}_R^2 + \mathfrak{m}_\Lambda \cdot R).
\]

Give the derivation from $R$ to $t^*_{R/\Lambda}$ which corresponds, under this
identification, to the projection of the mapping $d : R \to \Omega_{R/\Lambda}$ to $\Omega_{R/\Lambda} \otimes_R k$.
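
Exercises 2) and 3) can be tried out on a one-variable example. The sketch below assumes $\Lambda = \mathbf{Z}_p$ and $R = \mathbf{Z}_p[[X]]/(X^2 - p)$; this ring is chosen here purely for illustration, and $\bar X$ denotes the class of $X$ in the cotangent space.

```latex
% Illustrative choice: \Lambda = Z_p, R = Z_p[[X]]/(X^2 - p),
% so n = m = 1 and f_1 = X^2 - p.
% Exercise 2: one generator dX, one relation df_1 = 2X dX.
\[
  \Omega_{R/\Lambda} \;\cong\; R\,dX \,/\, (2X\,dX) \;\cong\; R/(2X).
\]
% Tensoring with k = F_p kills 2X (it lies in \mathfrak{m}_R), so
\[
  \Omega_{R/\Lambda} \otimes_R k \;\cong\; k\,dX .
\]
% Exercise 3: in R we have p = X^2, hence \mathfrak{m}_R = (X) and
% \mathfrak{m}_R^2 + pR = (X^2), giving
\[
  t^*_{R/\Lambda} \;=\; (X)/(X^2) \;\cong\; k\,\bar X ,
\]
% and the identification matches dX \otimes 1 with \bar X.
```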

In the case where the functor $D$ of the last section is prorepresentable
by an $A$-augmented coefficient-$\Lambda$-algebra $R$, then $D$ does satisfy hypothesis
$(T_A)$. To record the dependence of the relative Zariski tangent space

\[
t_{D,A} = D(A[\epsilon])
\]

on the representation $\rho : R \to A$, we find it useful sometimes to adopt the
alternate notation $t_{D,A} = t_{D,\rho}$. We have the following description of $t_{D,\rho}$
which follows directly from the definition.

(14) $t_{D,\rho}$ = the subset of $\mathrm{Hom}_{\Lambda\text{-alg}}(R, A[\epsilon])$ consisting of
those $\Lambda$-algebra homomorphisms whose composition
with the projection $A[\epsilon] \to A$ is equal to $\rho$.


The $A$-module $t_{D,\rho}$ may also be obtained directly from the $R$-module of
(continuous) Kähler differentials $\Omega_{R/\Lambda}$. Specifically,

Proposition. Let $D$ be represented by the coefficient-$\Lambda$-algebra $R$ (so, in
our usual terminology $D = D_R$) and let $\Omega_{R/\Lambda}$ denote the $R$-module of
relative continuous Kähler differentials of the topological $\Lambda$-algebra $R$. Let
$(A,\rho)$ be a pair where $A$ is a coefficient-$\Lambda$-algebra in $\hat{\mathcal{C}}_\Lambda$ and $\rho : R \to A$
is a coefficient-$\Lambda$-algebra homomorphism (i.e., $\rho$ an element in $D(A) :=
\mathrm{Hom}_{\Lambda\text{-alg}}(R,A)$). We view $A$ in this manner as an $R$-algebra.

Then we have a natural isomorphism of $A$-modules:

\[
\mathrm{Hom}_{A\text{-mod}}(\Omega_{R/\Lambda} \otimes_R A,\, A) \cong t_{D,\rho}.
\]

Note. In particular, if $A = k$ and $\rho = \bar\rho$, the original residual
representation, so that $t_{D,\bar\rho}$ is the Zariski tangent ($k$-vector) space attached to $D$, we
have:

\[
\mathrm{Hom}_k(\Omega_{R/\Lambda} \otimes_R k,\, k) = t_{R/\Lambda} = t_{D,\bar\rho}.
\]
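Proof of the Proposition. We have these canonical isomorphisms:

\[
\mathrm{Hom}_{A\text{-mod}}(\Omega_{R/\Lambda} \otimes_R A,\, A) \cong \mathrm{Hom}_{R\text{-mod}}(\Omega_{R/\Lambda}, A)
\tag{15}
\]
\[
\mathrm{Hom}_{R\text{-mod}}(\Omega_{R/\Lambda}, A) \cong \mathrm{Der}_{\Lambda}(R, A) \qquad\text{(by (12))}.
\tag{16}
\]

Moreover there is a natural injection

\[
\iota : \mathrm{Der}_{\Lambda}(R, A) \longrightarrow \mathrm{Hom}_{\Lambda\text{-alg}}(R, A[\epsilon])
\]

which sends a derivation $\delta \in \mathrm{Der}_{\Lambda}(R, A)$ to the $\Lambda$-algebra homomorphism
$\rho_\delta : R \to A[\epsilon]$ given by

\[
\rho_\delta(r) = \rho(r) \oplus \epsilon \cdot \delta(r) \in A \oplus \epsilon \cdot A.
\]

The injection $\iota$ identifies $\mathrm{Der}_{\Lambda}(R, A)$ with the subset of $\mathrm{Hom}_{\Lambda\text{-alg}}(R, A[\epsilon])$
consisting of the $\Lambda$-algebra homomorphisms from $R$ to $A[\epsilon]$ such that
composition with the projection $A[\epsilon] \to A$ yields $\rho : R \to A$. By (14) we then
have a bijection of sets

\[
\mathrm{Der}_{\Lambda}(R, A) \cong t_{D,\rho},
\tag{17}
\]

and checking back on the definition of the $A$-module structure of $t_{D,\rho}$ one
immediately sees that (17) is an isomorphism of $A$-modules. Putting (15)-(17)
together yields the proposition.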
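
The one point in this proof that deserves a line of computation is that $\rho_\delta$ really is a ring homomorphism. A sketch, using only $\epsilon^2 = 0$ and the Leibniz rule for $\delta$ (with $R$ acting on $A$ through $\rho$):

```latex
% \delta is a derivation into A, with R acting on A via \rho:
%   \delta(rs) = \rho(r)\,\delta(s) + \rho(s)\,\delta(r).
\[
  \rho_\delta(r)\,\rho_\delta(s)
  = \bigl(\rho(r) + \epsilon\,\delta(r)\bigr)\bigl(\rho(s) + \epsilon\,\delta(s)\bigr)
  = \rho(rs) + \epsilon\,\bigl(\rho(r)\,\delta(s) + \rho(s)\,\delta(r)\bigr)
  = \rho_\delta(rs).
\]
% Conversely, if \varphi : R \to A[\epsilon] is a \Lambda-algebra homomorphism
% lifting \rho, its \epsilon-coefficient satisfies the same Leibniz rule, so it
% is a derivation relative to \Lambda.
```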


§18. Schlessinger’s representability theorem. To get us in the mood,
let us begin with a result which is easy to prove (it is a good exercise) but
which is sometimes difficult to use because its hypothesis is hard to check:

Grothendieck’s Theorem. Let $D : \mathcal{C}_\Lambda \to \text{Sets}$ be a covariant functor
such that $D(k)$ consists of a single element. Then $D$ is pro-representable,
i.e., $D = D_R$ for some coefficient ring $R$ in $\hat{\mathcal{C}}_\Lambda$, if and only if $D$ satisfies
the “Mayer-Vietoris Property,” i.e., the mapping

\[
h : D(A \times_C B) \longrightarrow D(A) \times_{D(C)} D(B)
\]

is an isomorphism for all diagrams

\[
A \xrightarrow{\ \alpha\ } C \xleftarrow{\ \beta\ } B
\tag{8}
\]

in $\mathcal{C}_\Lambda$ and the Zariski tangent space $t_{D,k}$ is finite dimensional over $k$.


In contrast to this Theorem, which requires the Mayer-Vietoris Property
for all diagrams, the theorem of Schlessinger formulated below artfully cuts
down the number of diagrams for which one must check the Mayer-Vietoris
Property. To prepare for this:


Definition. A mapping $p : A \to C$ in $\mathcal{C}_\Lambda$ is small if its kernel is a
principal ideal annihilated by $\mathfrak{m}_A$.
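
A few standard instances may make the notion of a small mapping concrete. The rings below are illustrative choices made here (each is a $k$-algebra, hence a $\Lambda$-algebra through $\Lambda \to k$); note that in [Sch] small extensions are surjective by definition.

```latex
% Illustrative examples in \mathcal{C}_\Lambda (all are k-algebras).
% 1. k[\epsilon] \to k : the kernel \epsilon k is principal and killed by the
%    maximal ideal (\epsilon), since \epsilon^2 = 0.  Small.
% 2. k[x]/(x^{n+1}) \to k[x]/(x^{n}) : the kernel (x^n) is principal and
%    x \cdot x^n = 0.  Small.
% 3. k[x]/(x^{3}) \to k : the kernel (x) is principal, but x \cdot x = x^2 \neq 0.
%    Not small; it factors as a composite of small surjections:
\[
  k[x]/(x^3) \;\longrightarrow\; k[x]/(x^2) \;\longrightarrow\; k .
\]
```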


Schlessinger’s Theorem. Let $D : \mathcal{C}_\Lambda \to \text{Sets}$ be a covariant functor such
that $D(k)$ consists of a single element. Then $D$ is (pro)-representable if
and only if these four conditions hold:
(H1) $h$ is surjective if $A \to C$ is small (or equivalently: $h$ is surjective if
$A \to C$ is surjective).
(H2) $h$ is bijective if $A \to C$ is $k[\epsilon] \to k$. (Note: (H2) implies hypothesis
$(T_k)$ of §15 and therefore it implies that the “Zariski tangent space” $t_{D,k}$
is naturally endowed with the structure of $k$-vector space).
(H3) Hypothesis $(T_k)$ holds and $\dim_k(t_{D,k})$ is finite.
(H4) $h$ is bijective if $A \to C$ and $B \to C$ are equal, and small.
For a proof, see [Sch].

In view of our assignment in this conference, it is almost less important
to us that our functors be representable than that they satisfy hypothesis
$(T_A)$. This motivates us to make the following definition (which we state
in the relative case):


Definition. Fix a coefficient-$\Lambda$-algebra $A$. A covariant functor

\[
D : \hat{\mathcal{C}}_\Lambda(A) \longrightarrow \text{Sets}
\]

such that $D(A)$ is a single element will be called nearly representable if
it satisfies hypothesis $(T_A)$ of §16, together with the following “finiteness”
hypothesis (F).

(F) The relative Zariski tangent $A$-module $t_{D,A}$ is of finite type.

Although many of the functors of interest to us in this article will not
be representable, they will all turn out to be nearly representable, and also
they will all satisfy (H1), (H2), and (H3). It is condition (H4) that
will, at times, not be satisfied. Functors satisfying (H1,2,3) have an
important property which, at times, is a reasonable consolation even when
they are not prorepresentable. Such functors are, in any case, very nearly
pro-representable: a functor $D$ satisfying (H1,2,3) has, in Schlessinger’s
terms, a pro-representable hull (Def. 2.7 of [Sch]). To describe this
notion we must define what it means for a morphism of functors $\xi : D' \to D$
on $\mathcal{C}_\Lambda$ (such that $D'(k)$ and $D(k)$ are singletons) to be smooth. The
morphism $\xi$ is smooth if it satisfies the following “lifting property”: given
any surjection $B \to A$ in $\mathcal{C}_\Lambda$, any element $\alpha' \in D'(A)$ and any lifting
of $\alpha = \xi(\alpha') \in D(A)$ to an element $\beta \in D(B)$, there exists an element
$\beta' \in D'(B)$ which is a lifting of $\alpha'$ such that $\xi(\beta') = \beta$. Equivalently, we
may phrase this lifting property as the condition that the natural mapping

\[
D'(B) \longrightarrow D'(A) \times_{D(A)} D(B)
\]

be surjective for all surjections $B \to A$ in $\mathcal{C}_\Lambda$. It is also equivalent to request
the same surjectivity property for surjections $B \to A$ in $\hat{\mathcal{C}}_\Lambda$. For example,
if $D' \to D$ is smooth, then it follows that $D'(B) \to D(B)$ is surjective for
every $B$ in $\mathcal{C}_\Lambda$ (proof: use the lifting property with $A = k$). For a list of the
basic properties of smooth morphisms of functors, see Prop. 2.5 of [Sch].
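
The surjectivity of $D'(B) \to D(B)$ noted above can be spelled out as follows; this is a sketch using only the lifting property and the fact that $D'(k)$ and $D(k)$ are singletons (the symbol $*$ for their unique elements is introduced here).

```latex
% Claim: if \xi : D' \to D is smooth, then D'(B) \to D(B) is surjective.
% Apply the lifting property to the surjection B \to k (so A = k):
\[
  D'(B) \longrightarrow D'(k) \times_{D(k)} D(B)
  \;=\; \{*\} \times_{\{*\}} D(B) \;=\; D(B).
\]
% The left arrow is surjective by smoothness, and after the identification on
% the right it is exactly the map D'(B) \to D(B) induced by \xi.
```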
Definition. A pro-representable hull for $D$ is a pair $(R,\xi)$ where $R$ is
a coefficient-$\Lambda$-algebra, and $\xi : D_R \to D$ is a morphism of functors (where
$D_R$ is the functor pro-represented by $R$) satisfying two properties:

(i) the morphism of functors $\xi : D_R \to D$ is smooth.

(ii) the induced mapping of Zariski tangent spaces

\[
t_R \longrightarrow t_D
\]

is an isomorphism of $k$-vector spaces.


Schlessinger proves that any (covariant, Set-valued) functor $D$ on $\mathcal{C}_\Lambda$
such that $D(k)$ is a singleton, and which satisfies (H1,2,3) possesses
a pro-representable hull $(R,\xi)$ and, moreover, any two pro-representable
hulls of $D$ are isomorphic (but they are, in general, only “noncanonically”
isomorphic).

§19. Relatively representable subfunctors. Let us be given two
covariant functors $D' \subset D$ from the category $\mathcal{C}_\Lambda$ to sets with $D'$ a subfunctor
of $D$ (and, in particular, $D'(A)$ is a subset of $D(A)$ for all objects $A$ of
$\mathcal{C}_\Lambda$) such that $D'(k) = D(k)$ is a single element. Let us say that $D' \subset D$ is
relatively representable if for all diagrams in $\mathcal{C}_\Lambda$,

\[
A \xrightarrow{\ \alpha\ } C \xleftarrow{\ \beta\ } B
\tag{8}
\]

the square

\[
\begin{array}{ccc}
D'(A \times_C B) & \xrightarrow{\ h'\ } & D'(A) \times_{D'(C)} D'(B) \\
\cap & & \cap \\
D(A \times_C B) & \xrightarrow{\ h\ } & D(A) \times_{D(C)} D(B)
\end{array}
\]

is cartesian.
The terminology “relatively representable” is justified by the


Exercise. In the above context, if $D' \subset D$ is relatively representable, then
$D'$ satisfies (H$_i$) if $D$ does (this is true for each of the $i$’s ($i = 1,2,3,4$)
separately). Also, $D'$ satisfies $(T_k)$ if $D$ does, and $D'$ is nearly representable
if $D$ is. If $D$ is representable by a coefficient $\Lambda$-algebra $R_D$ then $D'$ is
representable by a quotient-$\Lambda$-algebra $R_{D'}$ of $R_D$.

A hint for this statement is given by the following fact:


Lemma. Let $R$ be a coefficient-$\Lambda$-algebra, and $D_R : \mathcal{C}_\Lambda \to \text{Sets}$ the functor
represented by $R$; i.e., $D_R(A) := \mathrm{Hom}_\Lambda(R,A)$. Let $\varphi : R_1 \to R_2$ be a
homomorphism of coefficient-$\Lambda$-algebras and denote by $\varphi$ again the natural
transformation of functors on $\mathcal{C}_\Lambda$ which is induced by $\varphi$, $\varphi : D_{R_2} \to D_{R_1}$.
Then these two properties are equivalent.

(i) The ring-homomorphism $\varphi : R_1 \to R_2$ is surjective.

(ii) The natural transformation $\varphi : D_{R_2} \to D_{R_1}$ is injective; i.e., we
may identify $D_{R_2}$ with a subfunctor of $D_{R_1}$.


Proof of Lemma. Clearly (i) implies (ii). To see that (ii) implies (i) note
that since $\varphi$ is a homomorphism of coefficient-$\Lambda$-algebras it induces the
identity on residue fields. Since the $R_i$ are complete noetherian local rings
it then suffices to show that $\varphi : R_1 \to R_2$ is surjective on Zariski cotangent
spaces, or, since these $k$-vector spaces are finite-dimensional, we must show,
dually, that the mapping induced by $\varphi$,

\[
D_{R_2}(k[\epsilon]) \longrightarrow D_{R_1}(k[\epsilon]),
\]

is injective, which it is by (ii).
§20. Representability results regarding deformation problems
attached to group representations. Let $\Pi$ be a profinite group satisfying
the $p$-finiteness condition, $k$ a finite field of characteristic $p$, $\Lambda$ a
coefficient-ring with residue field $k$, and

\[
\bar\rho : \Pi \longrightarrow \mathrm{GL}_N(k)
\]

a continuous representation. Recall the “absolute” $\Lambda$-deformation problem
for $\bar\rho$. This is given by the functor

\[
D_{\bar\rho} : \hat{\mathcal{C}}_\Lambda \longrightarrow \text{Sets}
\tag{1}
\]

which associates to each coefficient-$\Lambda$-algebra $B$ the set $D_{\bar\rho}(B)$ of
deformations of $\bar\rho$ to $B$.

Also, for a given choice of lifting $\rho : \Pi \to \mathrm{GL}_N(A)$ of $\bar\rho$ to a
coefficient-$\Lambda$-algebra $A$ (i.e., $\rho$ is an actual homomorphism, not just a strict equivalence
class) we have the “relative” $\Lambda$-deformation problem (relative to this lifting
$\rho$), given by the functor

\[
D_\rho : \hat{\mathcal{C}}_\Lambda(A) \longrightarrow \text{Sets}
\tag{2}
\]

which associates to each $A$-augmented coefficient-$\Lambda$-algebra $B$ the set

$D_\rho(B)$ := the set of deformations of $\rho$ to $B$.


If $A$ is a coefficient-$\Lambda$-algebra and $n$ a positive integer, let $A_n$ be the
artinian quotient coefficient-$\Lambda$-algebra, $A_n := A/\mathfrak{m}_A^n$.


Proposition 1 (Continuity). Let $\rho : \Pi \to \mathrm{GL}_N(A)$ be a lifting of $\bar\rho$ to
a coefficient-$\Lambda$-algebra $A$, and $\rho_n : \Pi \to \mathrm{GL}_N(A_n)$ the induced continuous
homomorphism for $n \ge 1$. The functor $D_\rho : \hat{\mathcal{C}}_\Lambda(A) \to \text{Sets}$ is continuous in
the sense that it satisfies (1) of §14; i.e.,

\[
D_\rho(B) = \varprojlim_{n}\, D_{\rho_n}(B_n)
\]

for all $A$-augmented coefficient-$\Lambda$-algebras $B$. The functor $D_{\bar\rho} : \hat{\mathcal{C}}_\Lambda \to \text{Sets}$
is continuous.


Proof. Let $\iota : D_\rho(B) \to \varprojlim_n D_{\rho_n}(B_n)$ denote the natural mapping,
which we will show to be bijective. We adopt the interpretation of the
functor $D_\rho$ (and of the functors $D_{\rho_n}$) given in §11. That is, letting W be
the underlying A-module of rank N endowed with the $\Pi$-representation $\rho$,
then for any A-augmented coefficient-$\Lambda$-algebra B, the set $D_\rho(B)$ is the set
of isomorphism classes of pairs $(V, \alpha)$ where V is a free B-module of rank
N, endowed with continuous (B-linear) $\Pi$-action, and $\alpha : V \otimes_B A \to W$
is an isomorphism of $A[[\Pi]]$-modules. We use the same notation with the
subscript “n” to describe the sets $D_{\rho_n}(B_n)$. Let $\{(V_n, \alpha_n)\}_{n \ge 1}$ denote a
cofinal system in the projective system $\{D_{\rho_n}(B_n)\}_{n \ge 1}$. So, for each $n \ge 1$,
$V_n$ is a $B_n[[\Pi]]$-module, free of rank N over $B_n$, and

\[ \alpha_n : V_n \otimes_{B_n} A_n \to W \otimes_A A_n \]

is an isomorphism of $A[[\Pi]]$-modules. Cofinality is expressed by the
existence of an isomorphism $\beta_n$ of $B_n[[\Pi]]$-modules,

\[ \beta_n : V_{n+1} \otimes_{B_{n+1}} B_n \to V_n \]

fitting into the commutative diagram

\[
\begin{array}{ccc}
V_{n+1} \otimes_{B_{n+1}} A_{n+1} & \xrightarrow{\ \alpha_{n+1}\ } & W \otimes_A A_{n+1} \\
\big\downarrow{\scriptstyle \beta_n \otimes \pi_n} & & \big\downarrow{\scriptstyle 1 \otimes \pi_n} \\
V_n \otimes_{B_n} A_n & \xrightarrow{\ \alpha_n\ } & W \otimes_A A_n
\end{array}
\]

where $\pi_n$ is the natural projection. Compiling the $B[[\Pi]]$-modules $V_n$ via
the compositions of the natural projections $V_{n+1} \to V_{n+1} \otimes_{B_{n+1}} B_n$
with (a choice of) $\beta_n$ for each $n \ge 1$, we see by Nakayama’s Lemma that
the $B[[\Pi]]$-module obtained in the (projective) limit,

\[ V := \varprojlim_n V_n \]

is a free B-module of rank N. The limit of the $\alpha_n$'s gives us an isomorphism
of $A[[\Pi]]$-modules $\alpha : V \otimes_B A \to W$.
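
This proves surjectivity of the mapping $\iota$. As for injectivity, let $(V, \alpha)$
and $(V', \alpha')$ be two representatives of elements in $D_\rho(B)$ such that we have
isomorphisms $\gamma_n : V_n \to V'_n$ of $B_n[[\Pi]]$-modules for each $n \ge 1$, such that
the $\gamma_n$'s “fit” into commutative diagrams

\[
\begin{array}{ccc}
V_n \otimes_{B_n} A_n & \xrightarrow{\ \alpha_n\ } & W \otimes_A A_n \\
\big\downarrow{\scriptstyle \gamma_n \otimes 1} & & \big\| \\
V'_n \otimes_{B_n} A_n & \xrightarrow{\ \alpha'_n\ } & W \otimes_A A_n
\end{array}
\]

Then the projective limit of the $\gamma_n$'s yields an isomorphism between the
couples $(V, \alpha)$ and $(V', \alpha')$.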


In view of Proposition 1, the functor $D_\rho$ (resp. $D_{\bar\rho}$) is representable if
and only if its restriction to the subcategory $\mathcal{C}_\Lambda(A)$ of artinian objects (resp.
$\mathcal{C}_\Lambda$) is “pro-representable.” Regarding the absolute deformation problem,
we have:


Proposition 2. Let $\bar\rho : \Pi \to GL_N(k)$ be a continuous residual representation,
with k a finite residue field of characteristic p, and $\Pi$ a profinite group
satisfying the p-finiteness condition. Fix $\Lambda$ a coefficient-ring with residue
field k.

(i) The functor $D_{\bar\rho}$ (restricted to $\mathcal{C}_\Lambda$) satisfies (H1), (H2), (H3).

(ii) If $\bar\rho$ is absolutely irreducible, then the functor $D_{\bar\rho}$ is representable.


Our relative deformation problems are all “nearly representable” and we 
shall formally state this fact in two “strengths”: 


Proposition 3a (“weak near representability”). For every artinian
coefficient-$\Lambda$-algebra A, and every lifting $\rho : \Pi \to GL_N(A)$ of $\bar\rho$ to A, the
relative functor $D_\rho$ is nearly representable in the sense of §18.


and 



Proposition 3b (“strong near representability”). For every
coefficient-$\Lambda$-algebra A, and every lifting $\rho : \Pi \to GL_N(A)$ of $\bar\rho$ to A, the relative
functor $D_\rho$ is nearly representable in the sense of §18.


We have separated the two statements above because the strong state- 
ment requires a somewhat more elaborate proof than the weak one does, 
and it is only the weak statement that we shall actually use in this article. 

We should include, in the above list of representability results, the
relationship between the “representability” of the absolute deformation problem
and the representability of the corresponding collection of “relative”
problems. Namely, if $\bar\rho$ is absolutely irreducible and R is the (“universal”)
coefficient-$\Lambda$-algebra representing the functor $D_{\bar\rho}$ (such an R existing, by
part (ii) of Proposition 2), then fixing a lifting $\rho$ of $\bar\rho$ to a
coefficient-$\Lambda$-algebra A, we may view the coefficient-$\Lambda$-algebra R as “A-augmented” via
the homomorphism $R \to A$ which classifies $\rho$. We have


Proposition 4. If $\bar\rho$ is absolutely irreducible and R is the (“universal”)
coefficient-$\Lambda$-algebra representing the functor $D_{\bar\rho}$, then for every lifting
$\rho : \Pi \to GL_N(A)$ of $\bar\rho$ to a coefficient-$\Lambda$-algebra A, which satisfies the
(“minimality”) property that the coefficient-$\Lambda$-algebra A is generated by
the traces of $\rho$, the functor $D_\rho : \mathcal{C}_\Lambda(A) \to \mathrm{Sets}$ is prorepresentable by
the A-augmented coefficient-$\Lambda$-algebra R. That is, for each A-augmented
coefficient-$\Lambda$-algebra B, there is a natural one-to-one correspondence between
the set of deformations of $\rho$ to B and the set of A-augmented
coefficient-$\Lambda$-algebra homomorphisms from R to B.


Remarks. We will not give the proof of Proposition 2, which has been 
written up in various places (e.g., [M 1]); item (ii) in its assertion requires 
Schlessinger’s Theorem. More germane to our purposes in these notes, 
really, are Propositions 3 and 4 whose proofs we will give, “independent of 
Proposition 2,” in full detail. Proposition 2 (i) implies that the absolute
deformation problem for any residual representation $\bar\rho$ (with $\Pi$ satisfying
the p-finiteness hypothesis) possesses a pro-representable hull in the sense
of Schlessinger [Sch]; see the discussion about this given in §21 below.


Proofs of Propositions 3a and 3b. Let A be any coefficient-$\Lambda$-algebra, and
$\rho : \Pi \to GL_N(A)$ a homomorphism. The first step is to show that our
functor $D_\rho$ satisfies hypothesis $(T_A)$. That is, we must show that

\[ h : D_\rho(A[\epsilon] \times_A A[\epsilon]) \to D_\rho(A[\epsilon]) \times_{D_\rho(A)} D_\rho(A[\epsilon]) = D_\rho(A[\epsilon]) \times D_\rho(A[\epsilon]) \]

is a bijection. That h is surjective is straightforward. We must show
injectivity of h, which is in fact also straightforward, but here it is. Let

\[ \gamma, \delta : \Pi \to GL_N(A[\epsilon] \times_A A[\epsilon]) \]

be homomorphisms representing two elements of $D_\rho(A[\epsilon] \times_A A[\epsilon])$ which
map to the same element under h. Let $\gamma_1, \delta_1 : \Pi \to GL_N(A[\epsilon])$ be the
homomorphisms obtained from $\gamma, \delta$ by composing them with the homomorphism

\[ GL_N(A[\epsilon] \times_A A[\epsilon]) \to GL_N(A[\epsilon]) \]

induced by projection to the first factor $A[\epsilon] \times_A A[\epsilon] \to A[\epsilon]$, and similarly
let $\gamma_2, \delta_2 : \Pi \to GL_N(A[\epsilon])$ be the ones induced by projection to the second
factor.

The corresponding homomorphisms $\Pi \to GL_N(A)$ obtained by projecting
all the way to $GL_N(A)$ are all equal to $\rho$.

Since $h(\gamma) = h(\delta)$ there are elements $\alpha_i \in GL_N(A[\epsilon])$ which “intertwine”
$\gamma_i$ with $\delta_i$ for i = 1, 2, and which project to the identity in $GL_N(A)$.
It follows that there is an element $\alpha \in GL_N(A[\epsilon] \times_A A[\epsilon])$ projecting to
$\alpha_1$ and $\alpha_2$ under the first and second projections respectively, and this $\alpha$
“intertwines” $\gamma$ with $\delta$, showing that $\gamma$ is strictly equivalent to $\delta$, i.e., that
h is injective.

To conclude the proof of Propositions 3a and 3b, we must show that the
Zariski tangent A-module $t_{D_\rho}$ is of finite type for an artinian
coefficient-$\Lambda$-algebra A, and for any coefficient-$\Lambda$-algebra A, respectively. This will be
done in Propositions 2a and 2b of the next section.


Proof of Proposition 4. We shall show that if $\bar\rho$ is absolutely irreducible,
then for every A-augmented coefficient-$\Lambda$-algebra B, the natural mapping

\[ D_\rho(B) \to \{ x \in D_{\bar\rho}(B) \mid x \mapsto \text{class of } \rho \text{ in } D_{\bar\rho}(A) \} \tag{3} \]

is a one-to-one correspondence, this statement being equivalent to the
statement of our proposition. First, let us show injectivity of (3), not using
the “minimality” assumption on A. For this, let $\rho_1, \rho_2 : \Pi \to GL_N(B)$
be two homomorphisms projecting to $\rho$ after composition with the map
$GL_N(B) \to GL_N(A)$, and which are assumed to be strictly equivalent
relative to $\bar\rho$. We must show them to be strictly equivalent relative to $\rho$. Let
$\tilde\alpha \in GL_N(B)$ be the element that intertwines $\rho_1$ to $\rho_2$, and let $\alpha \in GL_N(A)$
be the projection of $\tilde\alpha$. Since $\alpha$ commutes with the image of $\rho$, by Schur’s
Lemma, $\alpha$ is a scalar matrix in A, i.e., $\alpha = a \cdot I_N$ where $a \in A^*$ and $I_N$ is
the N x N identity matrix. Since $\alpha$ comes by projection from some matrix
in $GL_N(B)$ it follows that the element $a \in A^*$ is in the image of B, and let
$b \in B$ be some element which projects to $a \in A^*$. Since A is a local ring,
and $B \to A$ is a mapping of local rings, it follows that $b \in B^*$, and we may
form $\tilde\alpha' = b^{-1} \cdot \tilde\alpha \in GL_N(B)$ which projects to the identity in $GL_N(A)$
and intertwines $\rho_1$ to $\rho_2$, showing that $\rho_1$ is indeed strictly equivalent to
$\rho_2$ relative to $\rho$.

As for surjectivity, we must show that if we are given a homomorphism
$\rho_1 : \Pi \to GL_N(B)$ which after composition with $GL_N(B) \to GL_N(A)$ is in
the same strict equivalence class (relative to $\bar\rho$) as $\rho$ in $D_{\bar\rho}(A)$, then we can
find a homomorphism $\rho_2 : \Pi \to GL_N(B)$ which is strictly equivalent to
$\rho_1$ relative to $\bar\rho$ and which, under composition with $GL_N(B) \to GL_N(A)$,
projects to the homomorphism $\rho$ itself. For this, just note that by the
minimality property on A, if the target of (3) is non-empty (which we may assume)
the homomorphism $B \to A$ is surjective, and therefore an element $\alpha$ in
$\ker\{GL_N(A) \to GL_N(k)\}$ intertwining the image of $\rho_1$ with $\rho$ may be
lifted to an element $\tilde\alpha \in \ker\{GL_N(B) \to GL_N(k)\}$. Conjugating $\rho_1$ with
the inverse of $\tilde\alpha$, one gets the desired $\rho_2$.


CHAPTER V. ZARISKI TANGENT SPACES AND 
DEFORMATION PROBLEMS SUBJECT TO “CONDITIONS” 


§21. A cohomological interpretation of Zariski tangent A-modules. 
One of the basic tools of deformation theory is the cohomological interpre- 
tation of the Zariski tangent spaces that occur. This allows us at times to 
“control” the somewhat abstract universal deformation rings that occur by 
means of concrete cohomological calculations. 

To describe this, fix $\Pi$ a profinite group satisfying the p-finiteness
condition,

\[ \bar\rho : \Pi \to GL_N(k) \]

a continuous residual representation with k a finite field of characteristic
p, and $\Lambda$ a coefficient-ring with residue field k. For the discussion below,
we fix a specific homomorphism

\[ \rho : \Pi \to GL_N(A), \]

where A is a coefficient-$\Lambda$-algebra and we consider the deformation problem
relative to $\rho$, i.e., the functor

\[ D_\rho : \hat{\mathcal{C}}_\Lambda(A) \to \mathrm{Sets}. \]

More specifically, we consider its Zariski tangent A-module, $t_{D_\rho, A}$, which
we shall more simply denote $t_\rho$.

The cohomological interpretation we are referring to is as follows. Let
V be the free A-module of rank N, $V = A^N$, endowed with A-linear
$\Pi$-action given via composition of $\rho : \Pi \to GL_N(A)$ with the natural action
of $GL_N(A)$ on V. Let $\mathrm{End}_A(V)$ denote the free A-module (of rank $N^2$)
consisting of A-linear endomorphisms of V. The action of $\Pi$ on V induces
an action (the “adjoint action”) of $\Pi$ on $\mathrm{End}_A(V)$, the formula being

\[ (g \cdot e)(v) = \rho(g)\bigl(e(\rho(g)^{-1}(v))\bigr) \]

where $g \in \Pi$, $e \in \mathrm{End}_A(V)$, and $v \in V$.


Proposition 1. There is a natural isomorphism of A-modules

\[ t_\rho \cong H^1(\Pi, \mathrm{End}_A(V)). \]

Here, and in what follows, we use ‘continuous’ cohomology.


Sketch of Proof. Let $\Gamma := \ker\{GL_N(A[\epsilon]) \to GL_N(A)\}$ so that we have a
short exact sequence of groups

\[ 1 \to \Gamma \to GL_N(A[\epsilon]) \to GL_N(A) \to 1 \tag{4} \]

with a natural splitting, coming from the injection

\[ GL_N(A) \subset GL_N(A[\epsilon]), \]

so that we may view $GL_N(A[\epsilon])$ as a semi-direct product $GL_N(A) \ltimes \Gamma$.
Moreover, letting $M_N(A)$ denote the underlying additive group of the
A-algebra of N x N matrices with entries in A, there is a natural isomorphism
of (commutative) groups

\[ \Gamma = 1 + \epsilon \cdot M_N(A) \cong M_N(A) = \mathrm{End}_A(V), \qquad 1 + \epsilon \cdot m \mapsto m. \]

Using these isomorphisms, one may rewrite $GL_N(A[\epsilon])$ as the semi-direct
product

\[ GL_N(A[\epsilon]) = GL_N(A) \ltimes M_N(A), \]

where the action of $GL_N(A)$ on $M_N(A)$ is by conjugation, i.e., is the
standard “adjoint action.”

Now the set of deformations, $D_\rho(A[\epsilon])$, lifting $\rho$ to $A[\epsilon]$ is the set of
strict equivalence classes (relative to $\rho$) of homomorphisms $\rho'$ fitting into
the diagram

\[ \Pi \xrightarrow{\ \rho'\ } GL_N(A[\epsilon]) \cong GL_N(A) \ltimes \mathrm{End}_A(V) \to GL_N(A) \]

where the composition above is $\rho$. “Strict equivalence” means, of course,
that we will be considering the “orbit” of the set of such homomorphisms
$\rho'$, under the action of the subgroup $\Gamma \subset GL_N(A[\epsilon])$ via conjugation.

But let us first ignore the equivalence relation and study the set of
homomorphisms $\rho'$ which lift $\rho$ as above. There is a chosen such homomorphism,
call it $\rho_0$: namely the composition of $\rho$ with the natural imbedding
$GL_N(A) \subset GL_N(A[\epsilon])$. For any other $\rho'$, define the difference cocycle

\[ c_{\rho'} : \Pi \to \Gamma = \mathrm{End}_A(V) \]

by $c_{\rho'}(g) = \rho'(g) \cdot \rho_0(g)^{-1} \in \Gamma$ for $g \in \Pi$. Check first that this construction
$\rho' \mapsto c_{\rho'}$ provides a bijection between the set of liftings $\rho'$ of $\rho$, and
the set $Z^1(\Pi, \mathrm{End}_A(V))$ of 1-cocycles on $\Pi$ with values in the $\Pi$-module
$\mathrm{End}_A(V) = M_N(A)$, where the action of $\Pi$ on $\mathrm{End}_A(V)$ is the “adjoint
action” as described above. Then check that under this bijection, liftings
$\rho'$ and $\rho''$ of $\rho$ are strictly equivalent if and only if their associated cocycles
$c_{\rho'}$ and $c_{\rho''}$ are cohomologous. Finally, one must check that the resulting
bijection $t_\rho \cong H^1(\Pi, \mathrm{End}_A(V))$ is A-linear.
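
The first two checks requested in the sketch can be carried out in a few lines. The computation below is only a sketch: the symbol $m(g)$ is shorthand introduced here (it is not in the text) for the matrix with $c_{\rho'}(g) = 1 + \epsilon\, m(g)$, and $x$ is an arbitrary element of $M_N(A)$. Nothing is used beyond $\epsilon^2 = 0$ and the formula for the adjoint action.

```latex
% Assumed shorthand: \rho'(g) = (1 + \epsilon\, m(g))\,\rho_0(g), with m(g) \in M_N(A).
% (a) \rho' is a homomorphism if and only if m is a 1-cocycle for the adjoint action:
\begin{align*}
\rho'(gh) = \rho'(g)\,\rho'(h)
  &\iff 1 + \epsilon\, m(gh) = (1 + \epsilon\, m(g))\;\rho_0(g)\,(1 + \epsilon\, m(h))\,\rho_0(g)^{-1} \\
  &\iff 1 + \epsilon\, m(gh) = 1 + \epsilon\,\bigl(m(g) + \rho(g)\, m(h)\, \rho(g)^{-1}\bigr) \\
  &\iff m(gh) = m(g) + g \cdot m(h).
\end{align*}
% (b) Conjugating \rho' by 1 + \epsilon x \in \Gamma changes m by the coboundary g \mapsto x - g \cdot x:
\begin{align*}
(1 + \epsilon x)\,\rho'(g)\,(1 + \epsilon x)^{-1}
  &= (1 + \epsilon x)\,(1 + \epsilon\, m(g))\,\rho_0(g)\,(1 - \epsilon x) \\
  &= \bigl(1 + \epsilon\,(m(g) + x - g \cdot x)\bigr)\,\rho_0(g).
\end{align*}
```

Part (a) gives the bijection with $Z^1$, and part (b) shows that strict equivalence classes correspond to cohomology classes.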


We are now in a position to prove the following result, thereby concluding
the proof of Proposition 3a of §20.

Proposition 2a (“weak finiteness”). Let A be an artinian
coefficient-$\Lambda$-algebra. Then the Zariski tangent A-module $t_\rho$ is finite.


Proof of Proposition 2a. Let A be artinian. It suffices to show that the
A-module $H^1(\Pi, \mathrm{End}_A(V))$ is finite. Let $\Pi_0 \subset \Pi$ be the kernel of $\rho$. Since
A is artinian, $\Pi_0$ is an open subgroup of finite index in $\Pi$. The A-module
$H^1(\Pi, \mathrm{End}_A(V))$ fits into an exact sequence

\[ H^1(\Pi/\Pi_0, \mathrm{End}_A(V)) \to H^1(\Pi, \mathrm{End}_A(V)) \to \mathrm{Hom}(\Pi_0, \mathrm{End}_A(V)), \]

the left-most A-module being finite since both $\Pi/\Pi_0$ and $\mathrm{End}_A(V)$ are
finite, and the right-most A-module being finite since $\Pi$ satisfies the
p-finiteness condition.
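
The last step of this proof can be spelled out as follows. This is a sketch that assumes the p-finiteness condition is taken in the form "every open subgroup of finite index admits only finitely many continuous homomorphisms to $\mathbf{Z}/p\mathbf{Z}$"; the filtration $E_i$ and the abbreviation $E$ are notation introduced here for illustration.

```latex
% Assumption: for every open subgroup \Pi_0 of finite index in \Pi,
% the set Hom_cont(\Pi_0, Z/pZ) is finite.
% Write E = \mathrm{End}_A(V): a finite abelian p-group (A is artinian with finite residue field)
% on which \Pi_0 = \ker\rho acts trivially, so that
\[
H^1(\Pi_0, E) \;=\; \mathrm{Hom}_{\mathrm{cont}}(\Pi_0, E).
\]
% Choose subgroups 0 = E_0 \subset E_1 \subset \dots \subset E_r = E with E_i/E_{i-1} \cong \mathbf{Z}/p\mathbf{Z}.
% Left exactness of Hom_cont(\Pi_0, -) gives, for each i,
\[
0 \to \mathrm{Hom}_{\mathrm{cont}}(\Pi_0, E_{i-1}) \to \mathrm{Hom}_{\mathrm{cont}}(\Pi_0, E_i)
  \to \mathrm{Hom}_{\mathrm{cont}}(\Pi_0, \mathbf{Z}/p\mathbf{Z}),
\]
% and therefore, by induction on i,
\[
\#\,\mathrm{Hom}_{\mathrm{cont}}(\Pi_0, E) \;\le\; \bigl(\#\,\mathrm{Hom}_{\mathrm{cont}}(\Pi_0, \mathbf{Z}/p\mathbf{Z})\bigr)^{r},
\]
% which is finite.
```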


Remark. In the special case where $\bar\rho$ is absolutely irreducible, with universal
deformation ring $R = R(\bar\rho)$, the isomorphisms of the Proposition of §17 and
of Proposition 1 above give:

\[ \mathrm{Hom}_A(\Omega_{R/\Lambda} \otimes_R A, A) \cong t_\rho \cong H^1(\Pi, M_N(A)). \]

In particular, taking A = k and $\rho = \bar\rho$, the Zariski tangent k-vector space
of the local ring R has the following cohomological interpretation:

\[ \mathrm{Hom}_k(\mathfrak{m}_R/(p, \mathfrak{m}_R^2), k) \cong H^1(\Pi, M_N(k)). \]


We now can sketch the proof of the following “strong finiteness” result, 
thereby concluding the proof of Proposition 3b of §20. 


Proposition 2b (“strong finiteness”). Let A be any coefficient-$\Lambda$-algebra.
Then the Zariski tangent A-module $t_\rho$ is finite.


Proof. The functor $D_{\bar\rho}$ satisfies Schlessinger’s conditions (H 1,2,3) as given
in Proposition 2 of §20 above and as proved in [M 1]. By Schlessinger’s
Theorem (cf. the discussion in §18 above) $D_{\bar\rho}$ possesses a pro-representable
hull $(R, \xi)$. We have then the smooth morphism of functors

\[ \xi : D_R \to D_{\bar\rho} \]

(which induces an isomorphism on Zariski tangent k-vector spaces). Since

\[ \xi(A) : D_R(A) \to D_{\bar\rho}(A) \]

is surjective, we may (and do) choose some element $\tilde\rho \in D_R(A)$ whose
image under $\xi(A)$ is $\rho \in D_{\bar\rho}(A)$. Now, referring back to the terminology of
§17, we have the relative Zariski tangent A-modules for our two functors,
$D_R$ and $D_{\bar\rho}$ (relative to $\tilde\rho$ and $\rho$, respectively) the definitions of which we
now review:

    $t_{D_R, \tilde\rho}$ = the subset of $\mathrm{Hom}_{\Lambda\text{-alg}}(R, A[\epsilon])$ consisting of those
    $\Lambda$-algebra homomorphisms whose composition
    with the projection $A[\epsilon] \to A$ is equal to $\tilde\rho$.

    $t_\rho = t_{D_{\bar\rho}, \rho}$ = the subset of $D_{\bar\rho}(A[\epsilon])$ consisting of those
    elements whose image under the map induced
    by the projection $A[\epsilon] \to A$ is equal to $\rho$.


Since $A[\epsilon] \to A$ is surjective, and since the morphism $\xi$ is smooth, $\xi$
induces a surjection h of A-modules

\[ h : t_{D_R, \tilde\rho} \to t_\rho. \]

By the proposition of §17, we have a natural isomorphism of A-modules:

\[ t_{D_R, \tilde\rho} \cong \mathrm{Hom}_{A\text{-mod}}(\Omega_{R/\Lambda} \otimes_R A, A) \]

and, in particular, since $\Omega_{R/\Lambda}$ is of finite type as R-module, we see that
$t_{D_R, \tilde\rho}$ is of finite type as A-module, and therefore, by noetherianness of A
and the surjectivity of h, it follows that $t_\rho$ is of finite type over A.


For amusement, consider the corollary of Proposition 2b below. I apolo- 
gize in advance for (a) the clearly too round-about method of the proof of 
this corollary and (b) for not pausing long enough in these notes to prove 
anything stronger than: 


Corollary. (finiteness result for cohomology A-modules of profinite groups
satisfying the p-finiteness condition): Let A be a coefficient ring with (finite)
residue field of characteristic p. Let $\Pi$ be a profinite group satisfying the
p-finiteness condition. Let M be a free A-module of finite rank with a
continuous A-linear action of $\Pi$, and let $H^1(\Pi, M)$ denote the continuous
(1-dimensional) cohomology group of $\Pi$ with coefficients in the $A[\Pi]$-module
M. We give $H^1(\Pi, M)$ its natural A-module structure. Then $H^1(\Pi, M)$ is
of finite type as A-module.


Proof. Let V be the free A-module $M \oplus A$ endowed with $\Pi$-action which
is the direct sum of the given action on M and the trivial action on A. Let
N - 1 denote the rank of the free A-module M, so that N is the rank of
V. Let

\[ \rho : \Pi \to \mathrm{Aut}_A(V) \cong GL_N(A) \]

be the associated representation, and let $\bar\rho$ be its associated residual
representation. We view $\mathrm{End}_A(V)$ as an $A[\Pi]$-module where the action of
$\Pi$ is the adjoint action, and we let $H^1(\Pi, \mathrm{End}_A(V))$ denote the continuous
(1-dimensional) cohomology of $\Pi$ with coefficients in the $A[\Pi]$-module
$\mathrm{End}_A(V)$. Since $H^1(\Pi, M)$ is a direct summand of $H^1(\Pi, \mathrm{End}_A(V))$ and
since A is noetherian, it suffices to show that $H^1(\Pi, \mathrm{End}_A(V))$ is of finite
type as A-module. But by Proposition 1, we have an A-module isomorphism
$H^1(\Pi, \mathrm{End}_A(V)) \cong t_\rho$ and by Proposition 2b, $t_\rho$ is of finite type over
A.
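
The direct-summand claim used in this proof can be made explicit. The decomposition below is a sketch of one way to see it, using only that M and A are $\Pi$-stable summands of V and that $\Pi$ acts trivially on A; the map named "evaluation at 1" is notation chosen here.

```latex
% V = M \oplus A, with \Pi acting trivially on the summand A.
% Both summands are \Pi-stable, so the adjoint action preserves the block decomposition
\[
\mathrm{End}_A(V) \;=\; \mathrm{End}_A(M) \,\oplus\, \mathrm{Hom}_A(A, M) \,\oplus\, \mathrm{Hom}_A(M, A) \,\oplus\, \mathrm{End}_A(A).
\]
% Evaluation at 1 is \Pi-equivariant, because (g \cdot f)(1) = g\, f(g^{-1} \cdot 1) = g\, f(1):
\[
\mathrm{Hom}_A(A, M) \xrightarrow{\ \sim\ } M, \qquad f \mapsto f(1).
\]
% Cohomology commutes with finite direct sums, hence
\[
H^1(\Pi, \mathrm{End}_A(V)) \;\cong\; H^1(\Pi, \mathrm{End}_A(M)) \oplus H^1(\Pi, M)
   \oplus H^1(\Pi, \mathrm{Hom}_A(M, A)) \oplus H^1(\Pi, A).
\]
```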


§22. The Zariski tangent space of the universal deformation ring
as “extensions”. Another way of stating the proposition of the previous
section which is, perhaps, more revealing than the way we have stated it
and which is particularly useful in some contexts is the following:


Proposition. Let A be a coefficient-$\Lambda$-algebra. There are natural
isomorphisms of A-modules

\[ t_\rho \cong H^1(\Pi, \mathrm{End}_A(V)) \cong \mathrm{Ext}^1_{A[\Pi]}(V, V). \]

Here, $\mathrm{Ext}^1_{A[\Pi]}(V, V)$ means $\mathrm{Ext}^1$ in the category of profinite $A[\Pi]$-modules
with continuous $\Pi$-action and we will abbreviate it to the shorter notation:
$\mathrm{Ext}^1_\Pi(V, V)$. We will directly compare the two end-modules,

\[ t_\rho = D_\rho(A[\epsilon]) \cong \mathrm{Ext}^1_\Pi(V, V). \tag{5} \]

Given a deformation $V_1$ of V to $A[\epsilon]$, by restricting the ring of scalars
of $V_1$ from $A[\epsilon]$ to A (via the injection $\iota : A \to A[\epsilon]$) we may view $V_1$
as a free A-module of rank 2N, with an A-linear continuous action of the
group $\Pi$. Identifying the $A[\Pi]$-modules $\epsilon \cdot V_1$ and $V_1/\epsilon \cdot V_1$ with V (in the
natural manner) we then see $V_1$ as an extension of V by V in the category
of profinite $A[\Pi]$-modules with continuous $\Pi$-action:

\[ \mathcal{E} : 0 \to V \to V_1 \to V \to 0. \]

Sending the element of $t_\rho = D_\rho(A[\epsilon])$ which corresponds to the
isomorphism class of $V_1$ to the element of $\mathrm{Ext}^1_\Pi(V, V)$ corresponding to $\mathcal{E}$ gives a
well-defined mapping

\[ D_\rho(A[\epsilon]) \to \mathrm{Ext}^1_\Pi(V, V), \]

which is easily seen to be an A-module homomorphism. Going the other
way is equally direct: given an extension $\mathcal{E}$, one imposes an $A[\epsilon]$-module
structure on $V_1$ in the evident manner (multiplication by $\epsilon$ is given by
the composition of the projection $V_1 \to V$ with the inclusion $V \to V_1$),
enabling us to view $V_1$ as a deformation of V to $A[\epsilon]$. This gives the
isomorphism (5).
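
In matrix terms the passage between deformations and extensions looks as follows. This is a sketch with shorthand introduced here: $m(g) \in M_N(A)$ denotes the 1-cocycle attached to the lifting $\rho'$, so that $\rho'(g) = (1 + \epsilon\, m(g))\,\rho(g)$, and vectors of $V_1$ are written with their $\epsilon$-part first.

```latex
% Assumed shorthand: \rho'(g) = (1 + \epsilon\, m(g))\,\rho(g), with m \in Z^1(\Pi, \mathrm{End}_A(V)).
% As an A-module V_1 = \epsilon V \oplus V. With respect to this ordered decomposition,
\[
\rho'(g) \;=\; \begin{pmatrix} \rho(g) & m(g)\,\rho(g) \\ 0 & \rho(g) \end{pmatrix},
\qquad
\epsilon \;=\; \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}.
\]
% The submodule \epsilon V_1 (first coordinate) and the quotient V_1/\epsilon V_1 (second coordinate)
% both carry the action \rho, which exhibits the extension 0 \to V \to V_1 \to V \to 0.
% The two matrices commute, so multiplication by \epsilon is \Pi-equivariant.
```

Reading the display from right to left recovers the converse construction: an extension with a chosen A-linear splitting determines the upper-right block, and hence the cocycle.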

Remark. The second isomorphism displayed in the proposition above can
also be seen as coming from the “natural” isomorphism

\[ H^1(\Pi, \mathrm{End}_A(V)) \cong \mathrm{Ext}^1_\Pi(V, V) \tag{6} \]

which arises from the degeneration of the Spectral Sequence

\[ H^p(\Pi, \underline{\mathrm{Ext}}^q(V, V)) \Longrightarrow \mathrm{Ext}^{p+q}_\Pi(V, V) \tag{7} \]

where $\underline{\mathrm{Ext}}^q$ denotes “sheaf-Ext”; that is, $\underline{\mathrm{Ext}}^q = \mathrm{Ext}^q_A$, meaning “$\mathrm{Ext}^q$
in the category of profinite topological A-modules.” Since V is a free
A-module, $\underline{\mathrm{Ext}}^q(V, V) = 0$ for q > 0, and since the $\Pi$-module $\underline{\mathrm{Ext}}^0(V, V)$ is
just $\mathrm{End}_A(V)$ with the adjoint $\Pi$-action, the Spectral Sequence (7)
degenerates, yielding an isomorphism (6). But this is somewhat “learned” and
one is left with the chore of making precise the identifications involved, if
one wishes to check that the isomorphisms of the proposition in §21, and
(5) and (6) are compatible.


§23. “Deformation conditions”. It is often the case that one wishes to
restrict the deformation problem to deformations of $\bar\rho$ which satisfy certain
conditions. This will occupy us much in what is to come. In order to
treat issues that arise here in a uniform, rather than ad hoc, manner it
is convenient to pause and ask what general form these “conditions” will
take. Here is a suggestion that covers all conditions that we will encounter
in this article.

Fix N, $\Pi$, and $\Lambda$. Let us consider the category $\mathcal{F}_N = \mathcal{F}_N(\Lambda; \Pi)$ whose
objects are pairs (A, V) consisting of an artinian coefficient-$\Lambda$-algebra A,
and a free A-module V of rank N endowed with an A-linear continuous
$\Pi$-action. A morphism in $\mathcal{F}_N$ from an object (A, V) to an object
$(A_1, V_1)$ consists of a pair of morphisms $A \to A_1$ (of artinian
coefficient-$\Lambda$-algebras) and $V \to V_1$ (of A-modules) inducing an isomorphism of
$A_1$-modules $V \otimes_A A_1 \cong V_1$ which is compatible with $\Pi$-action in the evident
sense. Consider, now, the following three conditions on a full sub-category
$\mathcal{DF}_N \subset \mathcal{F}_N$:

(1) For any morphism $(A, V) \to (A_1, V_1)$ in $\mathcal{F}_N$, if (A, V) is an object of
$\mathcal{DF}_N$ then so is $(A_1, V_1)$.

(2) Let A, B, C be artinian coefficient-$\Lambda$-algebras fitting into a diagram

\[ A \xrightarrow{\ \alpha\ } C \xleftarrow{\ \beta\ } B. \]

Consider an object $(A \times_C B, V)$ in $\mathcal{F}_N$ and let $V_A$, $V_B$ denote the tensor
products of V with respect to the natural projections
from $A \times_C B$ to A and B, respectively. Then $(A \times_C B, V)$ is an object in
$\mathcal{DF}_N$ if and only if both $(A, V_A)$ and $(B, V_B)$ are objects of $\mathcal{DF}_N$.

(3) For any morphism $(A, V) \to (A_1, V_1)$ in $\mathcal{F}_N$, if $(A_1, V_1)$ is an object of
$\mathcal{DF}_N$ and $A \to A_1$ is injective, then (A, V) is an object of $\mathcal{DF}_N$.

Definition. Fix $\Lambda$, $\Pi$, and a continuous residual representation

\[ \bar\rho : \Pi \to GL_N(k). \]

We denote by $\bar V$ the N-dimensional k-vector space $k^N$ with k-linear
$\Pi$-action given by $\bar\rho$. By a deformation condition $\mathcal{D}$ for $\bar\rho$, we mean a full
subcategory $\mathcal{DF}_N \subset \mathcal{F}_N$ satisfying (1)-(3) and containing $(k, \bar V)$ as object.


Suppose that we are given a deformation condition $\mathcal{DF}_N \subset \mathcal{F}_N$. For a
coefficient-$\Lambda$-algebra A, and homomorphism

\[ \rho : \Pi \to GL_N(A) \]

lifting $\bar\rho$, let $V(\rho)$ denote the free A-module $A^N$ with $\Pi$-action given by $\rho$.
If (A, V(p)) is in DF then we will say that p is of type D, and its strict 
equivalence class is a D-deformation of type p. Define a subfunctor 


\[ \mathcal{D}_\rho \subset D_\rho : \mathcal{C}_\Lambda(A) \to \mathrm{Sets} \]

by the following rule:

For any artinian A-augmented coefficient-$\Lambda$-algebra B, and element
$\xi \in D_\rho(B)$ representing a strict equivalence class of liftings $\rho_1$ relative to $\rho$ to
B, $\xi$ is in the subset $\mathcal{D}_\rho(B) \subset D_\rho(B)$ if and only if $\rho_1$ is of type $\mathcal{D}$.
Extend the functor $\mathcal{D}_\rho$ to $\hat{\mathcal{C}}_\Lambda(A)$ by continuity; i.e., for B any A-augmented
coefficient-$\Lambda$-algebra, let $\mathcal{D}_\rho(B) \subset D_\rho(B)$ denote the subset

\[ \varprojlim_n \mathcal{D}_{\rho_n}(B_n) \subset \varprojlim_n D_{\rho_n}(B_n). \]


Proposition. If $\mathcal{D}$ is a deformation condition for $\bar\rho$, and $\rho : \Pi \to GL_N(A)$
is a lifting of $\bar\rho$, then the subfunctor $\mathcal{D}_\rho \subset D_\rho$ is relatively representable.


Proof. This is immediate from conditions (1) and (2) alone.


Corollary. If $\mathcal{D}$ is a deformation condition for $\bar\rho$, then the functor $\mathcal{D}_{\bar\rho}$ (on
$\mathcal{C}_\Lambda$) satisfies (H1), (H2), (H3). The functor $\mathcal{D}_{\bar\rho}$ has a “pro-representable
hull” in the sense of Schlessinger (cf. the discussion of §18 above). For $\rho$
any lifting of $\bar\rho$ the functor $\mathcal{D}_\rho$ is nearly representable. If $\bar\rho$ is absolutely
irreducible, then $\mathcal{D}_{\bar\rho}$ is pro-representable (by a quotient ring of the ring
pro-representing $D_{\bar\rho}$).

Proof. This is straightforward from the relevant definitions coupled with
the discussion of §18, the Exercise and Lemma in §19, and Proposition 2
of §20.


Fix a lifting $\rho$ of $\bar\rho$ to a coefficient-$\Lambda$-algebra A which is of type $\mathcal{D}$. Since
$\mathcal{D}_\rho$ is nearly representable, we may speak of the tangent A-module to $\mathcal{D}_\rho$
which is a sub-A-module of $t_\rho$ (necessarily of finite type) and we denote it

\[ t_{\mathcal{D},\rho} \subset t_\rho. \]

If $\bar\rho$ is absolutely irreducible and $R = R(\bar\rho)$ is the universal deformation
ring for deformations of $\bar\rho$ to coefficient-$\Lambda$-algebras, and if $R_{\mathcal{D}}$ is the
quotient-ring of R which represents the subfunctor $\mathcal{D}_{\bar\rho}$ then (by §17) we have
the following commutative diagram of A-modules in which vertical maps
are isomorphisms

\[
\begin{array}{ccc}
t_{\mathcal{D},\rho} & \subset & t_\rho \\
\big\downarrow & & \big\downarrow \\
\mathrm{Hom}_A(\Omega_{R_{\mathcal{D}}/\Lambda} \otimes_{R_{\mathcal{D}}} A, A) & \subset & \mathrm{Hom}_A(\Omega_{R/\Lambda} \otimes_R A, A).
\end{array}
\]

We may also quote the Proposition of §21 which identifies the A-modules
$t_\rho$ and $H^1(\Pi, \mathrm{End}_A(V))$ for $V = V(\rho)$. Therefore the sub-A-module $t_{\mathcal{D},\rho}$
of $t_\rho$ is identified with some sub-A-module of $H^1(\Pi, \mathrm{End}_A(V))$. Let us call
this sub-module $H^1_{\mathcal{D}}(\Pi, \mathrm{End}_A(V))$ so that we have a commutative square,

\[
\begin{array}{ccc}
t_{\mathcal{D},\rho} & \subset & t_\rho \\
\big\downarrow & & \big\downarrow \\
H^1_{\mathcal{D}}(\Pi, \mathrm{End}_A(V)) & \subset & H^1(\Pi, \mathrm{End}_A(V)).
\end{array}
\]

But of course this notation, $H^1_{\mathcal{D}}(\Pi, \mathrm{End}_A(V))$, is nothing more than a
“promissory note” that our theory will eventually be required to deal with: in
any concrete instance of a deformation condition $\mathcal{D}$, it will be our chore to
describe, in group-cohomological terms if possible, the sub-A-module

\[ H^1_{\mathcal{D}}(\Pi, \mathrm{End}_A(V)) \subset H^1(\Pi, \mathrm{End}_A(V)). \]

In the next two sections we will consider examples of this.


§24. Determinant conditions. Keep to our standard notation, so A is
a coefficient-$\Lambda$-algebra and $V(\rho) = V = A^N$ the free A-module of rank N
with $\Pi$-action induced by $\rho$.

Let $\delta : \Pi \to \Lambda^*$ be a continuous homomorphism. Say that V has
“determinant $\delta$” if the action of $\Pi$ on the N-fold wedge product $\bigwedge^N(V)$ of
the A-module V is given via the continuous homomorphism

\[ \delta_A : \Pi \xrightarrow{\ \delta\ } \Lambda^* \to A^* \]

where the homomorphism $\Lambda^* \to A^*$ comes from the $\Lambda$-algebra structure of
A. Explicitly, $g(w) = \delta_A(g) \cdot w$ for $g \in \Pi$, and $w \in \bigwedge^N(V)$. Let us keep
the quotation-marks around the phrase “determinant $\delta$” to remind us that
the actual determinant of our representation takes values in a different ring
than $\delta$ does. If $\rho'$ is a deformation of $\rho$ and $\rho'$ has “determinant $\delta$” then so
does $\rho$.

Let $\bar\rho$ be the underlying residual representation attached to $\rho$, so $\bar\rho$ also
has “determinant $\delta$.” Consider the property $\mathcal{D}$ which is defined by
imposing the condition on a homomorphism $\rho : \Pi \to GL_N(A)$ that it be of
“determinant $\delta$.”


Proposition. The condition $\mathcal{D}$ (of being of “determinant $\delta$”) is a
“deformation condition.” If $\rho : \Pi \to GL_N(A)$ is a continuous homomorphism of
“determinant $\delta$” and $V = V(\rho)$, we have

\[ H^1_{\mathcal{D}}(\Pi, \mathrm{End}_A(V)) = H^1(\Pi, \mathrm{End}^0_A(V)) \subset H^1(\Pi, \mathrm{End}_A(V)) \]

where $\mathrm{End}^0_A(V) \subset \mathrm{End}_A(V)$ is the sub A-module of “traceless”
endomorphisms (endomorphisms whose trace is zero). The sub-A-module $\mathrm{End}^0_A(V)$
is stable under the action of $\Pi$, the cohomology group $H^1(\Pi, \mathrm{End}^0_A(V))$
being computed with respect to this action.


Proof. The first sentence in our proposition is straightforward to show. The
second requires going back to the proof of the Proposition of §21. Consider
the commutative diagram

\[
\begin{array}{ccc}
1 + \epsilon \cdot M_N(A) & \xrightarrow{\ j\ } & M_N(A) \\
\big\downarrow{\scriptstyle \det} & & \big\downarrow{\scriptstyle \mathrm{Trace}} \\
1 + \epsilon \cdot A & \xrightarrow{\ j\ } & A
\end{array}
\]

where the mappings labeled j are the natural isomorphisms of groups given
by $j(1 + \epsilon \cdot x) = x$.

Let $\rho_0 : \Pi \to GL_N(A[\epsilon])$ be the composition of $\rho$ with the natural
injection induced by $\iota : A \to A[\epsilon]$.

For any lifting $\rho'$ of $\rho$ to $A[\epsilon]$, and any $g \in \Pi$, note that $\rho'(g) \cdot \rho_0(g)^{-1}$
lies in the subgroup $1 + \epsilon \cdot M_N(A) \subset GL_N(A[\epsilon])$. Now recall the “difference
cocycle” constructed in the proof of the proposition of §21, i.e.,

\[ c_{\rho'} : \Pi \to M_N(A) \]

given in our present notation by $c_{\rho'}(g) = j(\rho'(g) \cdot \rho_0(g)^{-1})$. By the
commutativity of the diagram above we have

\[ \mathrm{Trace}(c_{\rho'}(g)) = j\bigl(\det(\rho'(g)) \cdot \det(\rho(g))^{-1}\bigr) \]

and, in particular, the image of $c_{\rho'}$ lies in $M_N(A)^0$ if and only if $\rho'$ has
“determinant $\delta$.”
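
The commutativity of the square used in this proof amounts to the identity $\det(1 + \epsilon m) = 1 + \epsilon\,\mathrm{Trace}(m)$ over $A[\epsilon]$. A short verification, offered as a sketch in which $m = (m_{ij})$ denotes an arbitrary element of $M_N(A)$:

```latex
% Expand the determinant over A[\epsilon] by the Leibniz formula.
% Any permutation \sigma \neq \mathrm{id} moves at least two indices, so its term contains
% at least two off-diagonal factors \epsilon\, m_{i\sigma(i)} and vanishes because \epsilon^2 = 0.
\[
\det(1 + \epsilon m) \;=\; \prod_{i=1}^{N} (1 + \epsilon\, m_{ii})
  \;=\; 1 + \epsilon \sum_{i=1}^{N} m_{ii} \;=\; 1 + \epsilon\,\mathrm{Trace}(m).
\]
% Applied to 1 + \epsilon\, c_{\rho'}(g) = \rho'(g)\,\rho_0(g)^{-1}:
\[
\det(\rho'(g))\,\det(\rho_0(g))^{-1} \;=\; 1 + \epsilon\,\mathrm{Trace}(c_{\rho'}(g)).
\]
```

So the determinant of $\rho'$ agrees with that of $\rho_0$ exactly when the cocycle takes values in traceless matrices.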

In the special case of an absolutely irreducible residual representation
$\bar\rho$ of “determinant $\delta$,” if $R = R(\bar\rho)$ is the universal $\Lambda$-algebra deformation
ring for $\bar\rho$, and

\[ \rho^{\mathrm{univ}} : \Pi \to GL_N(R) \]

the universal deformation, one can describe explicitly $R_{\mathcal{D}}$, the quotient
$\Lambda$-algebra of R universal for deformations of $\bar\rho$ which are of “determinant $\delta$,”
in the following elementary way: Let

\[ \delta^{\mathrm{univ}} : \Pi \to R^* \]

denote the determinant of $\rho^{\mathrm{univ}}$, and let $\delta_R$ denote the composition of our
character $\delta$ with the natural homomorphism $u : \Lambda^* \to R^*$ which comes
from the $\Lambda$-algebra structure on R:

\[ \delta_R : \Pi \xrightarrow{\ \delta\ } \Lambda^* \xrightarrow{\ u\ } R^*. \]

Then $R_{\mathcal{D}} = R/I$ where I is the ideal in R generated by the elements

\[ \delta_R(g) - \delta^{\mathrm{univ}}(g) \in R \]

for all $g \in \Pi$ (equivalently, for a system of topological generators g of $\Pi$).


§25. Categorical conditions (Ramakrishna’s Theory cf. [Ram]).
In this section we fix $\Lambda$ a coefficient-ring, and $\Pi$ a profinite group. Let
$\mathrm{Rep}_\Lambda(\Pi)$ denote the category of $\Lambda$-modules of finite length which are
endowed with continuous, $\Lambda$-linear action of $\Pi$. Let $\mathcal{P}$ be a full subcategory
of $\mathrm{Rep}_\Lambda(\Pi)$ which is closed under passage to subobjects, quotients, and
direct sums. The following proposition shows that any such full subcategory
determines “deformation conditions”:

Proposition 1. Let $\mathcal{P}$ be a full subcategory of $\mathrm{Rep}_\Lambda(\Pi)$ closed under
subobjects, quotients, and direct sums. For N any positive integer let $\mathcal{PF}_N$
denote the full subcategory of the category $\mathcal{F}_N$ (as in §23) comprising those
objects (A, V) such that V, viewed as $\Lambda$-linear representation of $\Pi$, lies in
$\mathcal{P}$. Then $\mathcal{PF}_N$ is a “deformation condition” as defined in §23 above.


Proof. We must show that $\mathcal{PF}_N$ satisfies the two conditions


(1) For any morphism $(A, V) \to (A_1, V_1)$ in $\mathcal{F}_N$, if V is an object of
$\mathcal{P} \subset \mathrm{Rep}_\Lambda(\Pi)$ then so is $V_1$.

(2) If A, B, and C are in $\mathcal{C}_\Lambda$, and $\alpha : A \to C$ and $\beta : B \to C$ are
morphisms in $\mathcal{C}_\Lambda$, with $A \times_C B$ the associated fiber product, then
for any object $(A \times_C B, V)$ in $\mathcal{F}_N$, let $V_A$, $V_B$ denote the tensor
products of V with respect to the natural projections from $A \times_C B$
to A and B, respectively. Then V is in $\mathcal{P}$ if and only if $V_A$ and $V_B$
are both in $\mathcal{P}$.


But (1) holds since $\mathcal{P}$ is closed under passage to direct sums and
quotients and (2) holds because $\mathcal{P}$ is closed under passage to subobjects, direct
sums, and quotients.


Remark. In practice, the example of particular interest to us (and the
principal application in [Ram]) is in the case where $\Pi = G_{\mathbf{Q},S}$ with $p \in S$
and the condition $\mathcal{P}$ on our representations is the condition of being finite
flat at p. See the brief discussion of this condition for $\Pi = G_{\mathbf{Q}_\ell}$, with
$\ell \ne 2$, in §31 below. Explicitly, let $\mathcal{P} = \mathcal{P}_{\mathrm{fl}}$ be the full subcategory of
$\mathrm{Rep}_\Lambda(G_{\mathbf{Q}_\ell})$ whose objects are the $G_{\mathbf{Q}_\ell}$-representations whose representation
spaces are isomorphic to the $G_{\mathbf{Q}_\ell}$-Galois modules which can be given as the
generic fibers of finite flat group schemes over $\mathrm{Spec}(\mathbf{Z}_\ell)$. Then $\mathcal{P}_{\mathrm{fl}}$ is closed
under passage to subobjects, quotients, and direct sums, and consequently
gives rise to a “deformation condition.”

Fixing a full subcategory P (closed under sub-objects, quotient-objects, 
and direct sums) of Rep,(II), by the above proposition, we have that P 
determines deformation conditions for each positive integer N. Therefore, 
whenever we are given a coefficient-A-algebra A and a continuous homo- 
morphism p : II — GLy(A) such that V = V(p) is in Rep,(II), the 
deformation problem D, (obtained by restricting to deformations whose 
finite-length quotients lie in P) is nearly representable, and has a tangent 
A-module 

to) = Hp (Il, End(V)) c H' (I, End(Vv)). 


In terms of the description of $H^1(\Pi, \mathrm{End}(V))$ as an “Ext-group” given
in §22, we may give a direct categorical description of the sub-$A$-module
$H^1_{\mathcal{P}}(\Pi, \mathrm{End}(V))$, as follows.

If $V, V'$ are (continuous) $A[\Pi]$-modules of finite length (as $A$-modules)
which, as $\Lambda[\Pi]$-modules, are elements in $\mathcal{P}$, the subset

\[
\mathrm{Ext}_{A[\Pi],\mathcal{P}}(V, V') \subset \mathrm{Ext}_{A[\Pi]}(V, V')
\]

of extensions

\[
\mathcal{E} : 0 \to V' \to E \to V \to 0
\]

where $E$ is an $A[\Pi]$-module which, as $\Lambda[\Pi]$-module, is a member of $\mathcal{P}$
is easily seen to be a sub-$A$-module of $\mathrm{Ext}_{A[\Pi]}(V, V')$: the sum of two
elements,

\[
\mathcal{E}_1 : 0 \to V' \to E_1 \to V \to 0
\]

\[
\mathcal{E}_2 : 0 \to V' \to E_2 \to V \to 0
\]

in $\mathrm{Ext}_{A[\Pi]}(V, V')$ is obtained by restricting the direct sum $E_1 \oplus E_2$ to the
diagonal in $V \oplus V$ and then passing to the quotient by the anti-diagonal
$\{(v', -v')\}$ in $V' \oplus V'$.
Since $\mathcal{P}$ is closed under direct sum, sub-object, and quotient, if $E_1, E_2$ are
in $\mathcal{P}$, then the sum of the extensions $\mathcal{E}_1, \mathcal{E}_2$ remains in $\mathrm{Ext}_{A[\Pi],\mathcal{P}}(V, V')$,
which is also closed under scalar multiplication by $A$.
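
The sum of extensions described above (the Baer sum) can be written out explicitly. The following is only a sketch: the names $\iota_i$, $\pi_i$, $E_1 \times_V E_2$, $\Delta^-$ and $E_1 + E_2$ are notation introduced here for illustration and do not come from the text.

```latex
% Baer sum of two extensions  0 -> V' -> E_i -> V -> 0  (i = 1, 2),
% with inclusions \iota_i : V' -> E_i and projections \pi_i : E_i -> V.
\begin{align*}
E_1 \times_V E_2 &= \{(e_1, e_2) \in E_1 \oplus E_2 : \pi_1(e_1) = \pi_2(e_2)\}
   && \text{(restriction to the diagonal in } V \oplus V\text{)} \\
\Delta^- &= \{(\iota_1(v'), -\iota_2(v')) : v' \in V'\} \subset E_1 \times_V E_2
   && \text{(image of the anti-diagonal of } V' \oplus V'\text{)} \\
E_1 + E_2 &= (E_1 \times_V E_2)/\Delta^-
\end{align*}
% The resulting extension is  0 -> V' -> E_1 + E_2 -> V -> 0,  where
%   V' -> E_1 + E_2   sends v' to the class of (\iota_1(v'), 0), and
%   E_1 + E_2 -> V    sends the class of (e_1, e_2) to \pi_1(e_1) = \pi_2(e_2).
```

In this form $E_1 + E_2$ is visibly a quotient of a sub-object of $E_1 \oplus E_2$, which is exactly what the closure properties of $\mathcal{P}$ require.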


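Proposition 2. Let $V, V'$ be (continuous) $A[\Pi]$-modules of finite length (as
$A$-modules). We have the commutative diagram where the vertical maps are
isomorphisms

\[
\begin{array}{ccc}
t_{\mathcal{D}_{\mathcal{P}},\rho} & \subset & t_{\rho} \\
\downarrow & & \downarrow \\
\mathrm{Ext}_{A[\Pi],\mathcal{P}}(V, V') & \subset & \mathrm{Ext}_{A[\Pi]}(V, V').
\end{array}
\]

Proof. After the Proposition of §22 and the above discussion, this is
evident.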


CHAPTER VI. BACK TO GALOIS REPRESENTATIONS 


§26. Galois deformation conditions. Fix a coefficient-ring $\Lambda$ of
residual characteristic $p$, and positive integer $N$. Let $K$ be a number field and
$S$ a finite set of (non-archimedean) primes of $K$ which contains all primes
of residual characteristic $p$. By a Global Galois deformation problem
(for $\Lambda, N, K, S$ as above) let us mean that for each prime $\lambda \in S$ we specify
a “local deformation problem,” i.e., a deformation condition in the sense
of §23 for the decomposition group $\Pi = G_{K_\lambda}$.

More explicitly, we must specify (for each $\lambda \in S$) a full subcategory
$\mathcal{D}_\lambda\mathcal{F}_N(\Lambda; G_{K_\lambda})$ in the category $\mathcal{F}_N(\Lambda; G_{K_\lambda})$ which satisfies properties (1)
and (2) of §23.

Attached to such data, we may define a full subcategory $\mathcal{D}\mathcal{F}_N(\Lambda; G_{K,S})$
of $\mathcal{F}_N(\Lambda; G_{K,S})$ by restricting to objects $(A, V)$ of $\mathcal{F}_N(\Lambda; G_{K,S})$ such that
when, for each $\lambda \in S$, they are considered objects of $\mathcal{F}_N(\Lambda; G_{K_\lambda})$ they lie
in the subcategory $\mathcal{D}_\lambda\mathcal{F}_N(\Lambda; G_{K_\lambda})$.

By a Global Galois deformation problem with fixed determinant
we mean a slight variant of the above. Namely, for a character

\[
\delta : G_{K,S} \to \Lambda^*
\]

we let $\mathcal{D}\mathcal{F}_N(\Lambda; G_{K,S}; \delta)$ denote the full subcategory of $\mathcal{D}\mathcal{F}_N(\Lambda; G_{K,S})$
comprised of objects $(A, V)$ of “determinant $\delta$.”
Proposition 1. Given a Global Galois deformation problem, the full
subcategory $\mathcal{D}\mathcal{F}_N(\Lambda; G_{K,S}) \subset \mathcal{F}_N(\Lambda; G_{K,S})$ satisfies conditions (1) and (2) of
§23. It defines a deformation condition for $G_{K,S}$. The same is true, given
a Global Galois deformation problem with fixed determinant $\delta$, for the full
subcategory $\mathcal{D}\mathcal{F}_N(\Lambda; G_{K,S}; \delta) \subset \mathcal{F}_N(\Lambda; G_{K,S}; \delta)$.

Proof. A straightforward check of the definitions involved. 


It follows that Global Galois deformation problems, with or without 
fixed determinant, give us “deformation conditions” and we may ask for a 
description of their associated Zariski tangent A-modules. 


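Proposition 2. Let $\mathcal{D}$ be the deformation condition attached to a Global
deformation problem for $G_{K,S}$ and for each $\lambda \in S$ let $\mathcal{D}_\lambda$ be the
corresponding local deformation problem. Let $A$ be a coefficient-$\Lambda$-algebra, and

\[
\rho : G_{K,S} \to \mathrm{GL}_N(A)
\]

a continuous homomorphism. Let $V = V(\rho)$.
The Zariski tangent $A$-module $t_{\mathcal{D},\rho}$ fits into a cartesian square of
$A$-modules

\[
\begin{array}{ccc}
t_{\mathcal{D},\rho} & \longrightarrow & H^1(G_{K,S}, \mathrm{End}(V)) \\
\downarrow & & \downarrow \\
\bigoplus_{\lambda \in S} H^1_{\mathcal{D}_\lambda}(G_{K_\lambda}, \mathrm{End}(V)) & \longrightarrow & \bigoplus_{\lambda \in S} H^1(G_{K_\lambda}, \mathrm{End}(V))
\end{array}
\tag{7}
\]

where the horizontal homomorphisms are the natural injections coming
from the Proposition and surrounding discussion of §21, and the right-hand
vertical homomorphism is induced from restriction of cohomology.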


Proof. This again comes from a straightforward reduction of the defini- 
tions involved, together with the Proposition in §21 and its surrounding 
discussion. 


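If $\mathcal{D}$ is the deformation condition attached to a Global deformation
problem with fixed determinant for $G_{K,S}$ then we have a diagram very similar
to (7), but with one minor change, namely:


Proposition 3. We have a cartesian diagram

\[
\begin{array}{ccc}
t_{\mathcal{D},\rho} & \longrightarrow & H^1(G_{K,S}, \mathrm{End}(V)^0) \\
\downarrow & & \downarrow \\
\bigoplus_{\lambda \in S} H^1_{\mathcal{D}_\lambda}(G_{K_\lambda}, \mathrm{End}(V)) & \longrightarrow & \bigoplus_{\lambda \in S} H^1(G_{K_\lambda}, \mathrm{End}(V)).
\end{array}
\tag{8}
\]


The “cartesian-ness” of diagram (7) is nothing more than the following
“Selmer-like” description of the sub-$A$-module $H^1_{\mathcal{D}}(G_{K,S}, \mathrm{End}(V))$ in
$H^1(G_{K,S}, \mathrm{End}(V))$. Namely,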
Proposition 2′. Under the hypotheses of Proposition 2,

\[
H^1_{\mathcal{D}}(G_{K,S}, \mathrm{End}(V))
\]

consists of those cohomology classes in $H^1(G_{K,S}, \mathrm{End}(V))$ which, when
restricted to $G_{K_\lambda}$, land in the submodule

\[
H^1_{\mathcal{D}_\lambda}(G_{K_\lambda}, \mathrm{End}(V)) \subset H^1(G_{K_\lambda}, \mathrm{End}(V))
\]

for each $\lambda \in S$.


There is a similar paraphrase of Proposition 3. 
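
Equivalently, the “Selmer-like” description can be phrased as a kernel. The display below is a sketch of that reformulation under the hypotheses of Proposition 2; the label $\mathrm{res}_\lambda$ for restriction to the decomposition group at $\lambda$ is notation introduced here.

```latex
% Kernel form of the Selmer-type condition: a global class lies in the
% restricted submodule exactly when each local restriction dies in the
% quotient of local cohomology by the local condition.
\[
H^1_{\mathcal{D}}(G_{K,S}, \mathrm{End}(V))
  = \ker\Bigl( H^1(G_{K,S}, \mathrm{End}(V))
      \xrightarrow{\ \oplus_\lambda \mathrm{res}_\lambda\ }
      \bigoplus_{\lambda \in S}
      \frac{H^1(G_{K_\lambda}, \mathrm{End}(V))}
           {H^1_{\mathcal{D}_\lambda}(G_{K_\lambda}, \mathrm{End}(V))} \Bigr).
\]
```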

In the special case when $\bar\rho$ is absolutely irreducible, the deformation
problem for $\mathcal{D}$ is then representable and we have its corresponding universal
deformation ring ($\Lambda$-algebra) $R_{\mathcal{D}}$. We may then use the proposition of
§17 to give us an alternate description of $t_{\mathcal{D},\rho}$ in terms of the module of
Kähler differentials of the $\Lambda$-algebra $R_{\mathcal{D}}$. We recall this explicitly in the
next proposition.


Proposition 4. Under the hypotheses of Proposition 1, assume further
that the residual representation $\bar\rho$, of which $\rho$ is a lifting, is absolutely
irreducible. We have natural $A$-module isomorphisms

\[
\mathrm{Hom}_A(\Omega_{R_{\mathcal{D}}/\Lambda} \otimes_{R_{\mathcal{D}}} A, A) \cong t_{\mathcal{D},\rho} \cong H^1_{\mathcal{D}}(G_{K,S}, \mathrm{End}_A(V)).
\]

We have a similar statement under the hypotheses of Proposition 3.
Proposition 4 has a number of slightly variant forms which come up, and
we give two of these in the next sections.


§27. Passage to the limit. Now let $A$ be a coefficient-$\Lambda$-algebra which
is $p$-torsion-free. Let $\rho : G_{K,S} \to \mathrm{GL}_N(A)$ be a continuous homomorphism
whose associated residual representation is absolutely irreducible. Let

\[
\rho_n : G_{K,S} \to \mathrm{GL}_N(A/p^nA)
\]

be the “reduction mod $p^n$” of $\rho$. Denote $A/p^nA$ by $A_n$, for short. Assume
that we have a Global Galois deformation problem $\mathcal{D}$, as in Proposition 4 of
the previous section, where the $\rho_n$'s are all of type $\mathcal{D}$. There is a completely
analogous treatment of the case of Global Galois deformation problems with
fixed determinant (obtained by replacing $\mathrm{End}(V)$'s by $\mathrm{End}^0(V)$'s) which
we leave to the reader. By Proposition 4, we obtain commutative diagrams
with the horizontal mappings isomorphisms, and the vertical mappings
inclusions,

\[
\begin{array}{ccc}
\mathrm{Hom}_A(\Omega_{R_{\mathcal{D}}/\Lambda} \otimes_{R_{\mathcal{D}}} A, A_n) & \cong & H^1_{\mathcal{D}}(G_{K,S}, \mathrm{End}_{A_n}(V_n)) \\
\downarrow & & \downarrow \\
\mathrm{Hom}_A(\Omega_{R/\Lambda} \otimes_{R} A, A_n) & \cong & H^1(G_{K,S}, \mathrm{End}_{A_n}(V_n))
\end{array}
\]

for each $n$.
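Now consider the directed system of $A$-modules,

\[
\cdots \longrightarrow A_n \xrightarrow{\ \tau_n\ } A_{n+1} \longrightarrow \cdots
\]

where $\tau_n$ is the mapping induced from multiplication by $p$ viewed as
endomorphism of the $A$-module $A_{n+1}$. The mappings $\tau_n$ are injections because
$A$ is $p$-torsion-free. Consider also the directed system of $G_{K,S}$-modules

\[
\cdots \longrightarrow \mathrm{End}_{A_n}(V_n) \xrightarrow{\ \tau_n\ } \mathrm{End}_{A_{n+1}}(V_{n+1}) \longrightarrow \cdots
\]

where $\tau_n$ associates to an endomorphism

\[
\gamma_n : V_n \to V_n
\]

the endomorphism $\gamma_{n+1}$ which is the composition

\[
V_{n+1} \longrightarrow V_n \xrightarrow{\ \gamma_n\ } V_n \xrightarrow{\ \tau_n\ } V_{n+1},
\]

the unlabeled mapping being the natural projection (and the last arrow being
induced by multiplication by $p$). The direct limits of
these systems are given, respectively, by

\[
\varinjlim A_n = A \otimes_{\mathbf{Z}_p} \mathbf{Q}_p/\mathbf{Z}_p
\]

and

\[
\varinjlim \mathrm{End}_{A_n}(V_n) = \mathrm{End}_A(V) \otimes_{\mathbf{Z}_p} \mathbf{Q}_p/\mathbf{Z}_p.
\]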
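
The first of these direct limits can be checked by hand. The following is a sketch, assuming only that $A$ is $p$-torsion-free (so that $A$ injects into $A[1/p]$); the identification of $A_n$ with $p^{-n}A/A$ is introduced here for illustration.

```latex
% Identify A_n with the p^n-torsion of A[1/p]/A; the transition map
% "multiplication by p" then becomes an inclusion.
\begin{align*}
A_n = A/p^nA &\xrightarrow{\ \sim\ } p^{-n}A/A \subset A[1/p]/A,
   & a \bmod p^n &\longmapsto a/p^n \bmod A, \\
A_n &\longrightarrow A_{n+1},
   & a \bmod p^n &\longmapsto pa \bmod p^{n+1}, \\
\varinjlim_n A_n &= \bigcup_n p^{-n}A/A = A[1/p]/A
   \cong A \otimes_{\mathbf{Z}_p} \mathbf{Q}_p/\mathbf{Z}_p .
\end{align*}
% Compatibility of the two rows: a/p^n = (pa)/p^{n+1} in A[1/p]/A.
```

The same argument, applied to the free $A$-module $\mathrm{End}_A(V)$ in place of $A$, accounts for the second limit.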
We have 


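Proposition. The diagram

\[
\begin{array}{ccc}
\mathrm{Hom}_A(\Omega_{R/\Lambda} \otimes_R A, A_n) & \cong & H^1(G_{K,S}, \mathrm{End}_{A_n}(V_n)) \\
\downarrow & & \downarrow \\
\mathrm{Hom}_A(\Omega_{R/\Lambda} \otimes_R A, A_{n+1}) & \cong & H^1(G_{K,S}, \mathrm{End}_{A_{n+1}}(V_{n+1}))
\end{array}
\]

is commutative for each $n$, where the horizontal isomorphisms are those
provided by Proposition 4 of §26.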


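Proof. One may, of course, just grind this out. But I find it useful, in
thinking about this compatibility, to introduce a subring of $A_{n+1}[\epsilon]$,
intermediate between $A_n[\epsilon]$ and $A_{n+1}[\epsilon]$ (call it $A_{n+1}[\epsilon]'$) given by

\[
A_{n+1}[\epsilon]' = A_{n+1} \oplus \epsilon \cdot pA_{n+1} \cong A_{n+1} \oplus \epsilon \cdot A_n.
\]

We have a little diagram of $A$-algebras

\[
\begin{array}{ccccc}
A_{n+1}[\epsilon]' & \longrightarrow & A_n[\epsilon] & \longrightarrow & 0 \\
\downarrow & & & & \\
A_{n+1}[\epsilon]. & & & &
\end{array}
\]

Now if

\[
\begin{array}{rcl}
\tilde\rho_n : G_{K,S} & \longrightarrow & \mathrm{GL}_N(A_n[\epsilon]) = \mathrm{GL}_N(A_n) \ltimes \{1 + \epsilon \cdot M_N(A_n)\} \\
x & \longmapsto & (\rho_n(x), \gamma(x))
\end{array}
\]

represents some element in $t_{\rho_n}$, we have a natural lifting, call it $\tilde\rho_n'$, of $\tilde\rho_n$
to $A_{n+1}[\epsilon]'$ defined in terms of the semi-direct product decomposition

\[
\mathrm{GL}_N(A_{n+1}[\epsilon]') = \mathrm{GL}_N(A_{n+1}) \ltimes \{1 + \epsilon \cdot M_N(A_n)\}
\]

by the formula $\tilde\rho_n'(x) = (\rho_{n+1}(x), \gamma(x))$ for $x \in G_{K,S}$.

The imbedding of $A_{n+1}[\epsilon]'$ in $A_{n+1}[\epsilon]$ allows us to view $\tilde\rho_n'$ as element,
now, of $t_{\rho_{n+1}}$. This defines a mapping of $A$-modules, $k_n : t_{\rho_n} \to t_{\rho_{n+1}}$, and
a somewhat more direct check, using the definitions involved, shows that
each of the two diagrams,

\[
\begin{array}{ccc}
\mathrm{Hom}_A(\Omega_{R/\Lambda} \otimes_R A, A_n) & \cong & t_{\rho_n} \\
\downarrow & & \downarrow k_n \\
\mathrm{Hom}_A(\Omega_{R/\Lambda} \otimes_R A, A_{n+1}) & \cong & t_{\rho_{n+1}}
\end{array}
\]

and

\[
\begin{array}{ccc}
t_{\rho_n} & \cong & H^1(G_{K,S}, \mathrm{End}_{A_n}(V_n)) \\
k_n \downarrow & & \downarrow j_n \\
t_{\rho_{n+1}} & \cong & H^1(G_{K,S}, \mathrm{End}_{A_{n+1}}(V_{n+1}))
\end{array}
\tag{10}
\]

are commutative.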

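Since cohomology commutes with direct limits (as our modules are
discrete Galois modules) we have that $H^1(G_{K,S}, \mathrm{End}_A(V) \otimes_{\mathbf{Z}_p} \mathbf{Q}_p/\mathbf{Z}_p)$ is the
direct limit of the right-hand vertical direct system in (10) and we denote
by

\[
H^1_{\mathcal{D}}(G_{K,S}, \mathrm{End}_A(V) \otimes_{\mathbf{Z}_p} \mathbf{Q}_p/\mathbf{Z}_p) \subset H^1(G_{K,S}, \mathrm{End}_A(V) \otimes_{\mathbf{Z}_p} \mathbf{Q}_p/\mathbf{Z}_p)
\]

the direct limit of the sub-system

\[
\begin{array}{ccc}
H^1_{\mathcal{D}}(G_{K,S}, \mathrm{End}_{A_n}(V_n)) & \subset & H^1(G_{K,S}, \mathrm{End}_{A_n}(V_n)) \\
\downarrow & & \downarrow \\
H^1_{\mathcal{D}}(G_{K,S}, \mathrm{End}_{A_{n+1}}(V_{n+1})) & \subset & H^1(G_{K,S}, \mathrm{End}_{A_{n+1}}(V_{n+1})).
\end{array}
\]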
§28. The special case when $A = \Lambda$. Return to the situation of
Proposition 4 of §26, and let

\[
\rho : G_{K,S} \to \mathrm{GL}_N(\Lambda)
\]

denote a lifting of $\bar\rho$, an absolutely irreducible residual representation.
Assume that $\rho$ is of type $\mathcal{D}$ (for $\mathcal{D}$ a given Global Galois deformation problem).
From Proposition 4 of §26, we have isomorphisms

\[
\mathrm{Hom}_\Lambda(\Omega_{R_{\mathcal{D}}/\Lambda} \otimes_{R_{\mathcal{D}}} \Lambda, \Lambda/J) \cong H^1_{\mathcal{D}}(G_{K,S}, \mathrm{End}_\Lambda(V) \otimes \Lambda/J),
\]

for any ideal $J \subset \Lambda$ (by taking $A = \Lambda/J$).
The classifying mapping $R_{\mathcal{D}} \to \Lambda$ for the deformation $\rho$ of $\bar\rho$ to $\Lambda$, being
a homomorphism of $\Lambda$-algebras, is surjective. Denote its kernel by $I$, so
that we have the canonical splitting as $\Lambda$-modules,

\[
R_{\mathcal{D}} = \Lambda \oplus I,
\tag{13}
\]

and there is a natural isomorphism of $\Lambda$-modules

\[
I/I^2 \cong \Omega_{R_{\mathcal{D}}/\Lambda} \otimes_{R_{\mathcal{D}}} \Lambda.
\]

We leave the proof of this as an exercise to the reader with the hint
that the universal property of Kähler differentials may be the easiest way
to see this. We then may paraphrase Proposition 4 of §26 as giving the
isomorphism of $\Lambda$-modules

\[
\mathrm{Hom}_\Lambda(I/I^2, \Lambda/J) \cong H^1_{\mathcal{D}}(G_{K,S}, \mathrm{End}_\Lambda(V) \otimes \Lambda/J)
\]

for any ideal $J \subset \Lambda$, and we may similarly paraphrase Corollary 1 of §27
as giving, under the condition that $\Lambda$ is $p$-torsion-free, the isomorphism of
$\Lambda$-modules

\[
\mathrm{Hom}_\Lambda(I/I^2, \Lambda \otimes_{\mathbf{Z}_p} \mathbf{Q}_p/\mathbf{Z}_p) \cong H^1_{\mathcal{D}}(G_{K,S}, \mathrm{End}_\Lambda(V) \otimes_{\mathbf{Z}_p} \mathbf{Q}_p/\mathbf{Z}_p).
\]
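
For the reader who wants a starting point for the exercise on $I/I^2$ above, here is one possible outline via the universal property of Kähler differentials. It is only a sketch: $M$ denotes an arbitrary $\Lambda$-module viewed as a module over the deformation ring through the classifying map, $\mathrm{Der}_\Lambda$ denotes (continuous) $\Lambda$-derivations, and questions of continuity and completion are passed over.

```latex
% R denotes the universal deformation ring of this section, written
% R = \Lambda \oplus I  with  I  the kernel of the classifying map R -> \Lambda.
\begin{align*}
\mathrm{Hom}_\Lambda(\Omega_{R/\Lambda} \otimes_R \Lambda, M)
  &\cong \mathrm{Hom}_R(\Omega_{R/\Lambda}, M)
  \cong \mathrm{Der}_\Lambda(R, M)
  && \text{(universal property)} \\
\mathrm{Der}_\Lambda(R, M) &\longrightarrow \mathrm{Hom}_\Lambda(I/I^2, M),
  \qquad d \longmapsto d|_I
  && (d(xy) = x\,dy + y\,dx = 0 \text{ for } x, y \in I) \\
\mathrm{Hom}_\Lambda(I/I^2, M) &\longrightarrow \mathrm{Der}_\Lambda(R, M),
  \qquad f \longmapsto \bigl(\lambda + x \mapsto f(x \bmod I^2)\bigr)
  && (\lambda \in \Lambda,\ x \in I)
\end{align*}
% The last two maps are mutually inverse, so
%   Hom_\Lambda(I/I^2, M)  \cong  Hom_\Lambda(\Omega_{R/\Lambda} \otimes_R \Lambda, M)
% naturally in M, which identifies I/I^2 with \Omega_{R/\Lambda} \otimes_R \Lambda.
```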
§29. A bestiary of local Galois deformation conditions. Let $K_\lambda$ be
a local field of residual characteristic $\ell$. It remains to describe some useful local
deformation conditions $\mathcal{D}_\lambda$. For each of these conditions, we must (a)
check that they are indeed “deformation conditions” in the technical sense
described above and (b) describe the sub $A$-modules

\[
H^1_{\mathcal{D}_\lambda}(G_{K_\lambda}, \mathrm{End}_A(V)) \subset H^1(G_{K_\lambda}, \mathrm{End}_A(V))
\]

concretely. We devote this section to the study of
Minimal ramification conditions for $\ell \neq p$ (in degree 2).
Let $\bar\rho : G_{K_\lambda} \to \mathrm{GL}_2(k)$ be a residual representation, with $k$ a finite field of
characteristic $p \neq \ell$.

Suppose either (A) or (B) below.
(A) The image of the inertia subgroup $I \subset G_{K_\lambda}$ under $\bar\rho$ is nontrivial, and
is contained in a subgroup of $\mathrm{GL}_2(k)$ which is conjugate to

\[
\begin{pmatrix} 1 & * \\ 0 & 1 \end{pmatrix}.
\]

In this case say that a deformation $\rho$ of $\bar\rho$ to a coefficient-ring $A$ with residue
field $k$ is minimally ramified if the image under $\rho$ of $I$ is contained in a
subgroup of $\mathrm{GL}_2(A)$ which is conjugate to

\[
\begin{pmatrix} 1 & * \\ 0 & 1 \end{pmatrix}.
\]

In this case, the action of the inertia group $I$ factors through the “tame
quotient,” and more specifically, through the pro-$p$-completion of $I$, which
is a free pro-$p$-group on one generator; cf. [Se3], §1.


Remark. To say that $\rho$ is minimally ramified in the sense of (A) is
equivalent to saying that we may choose a suitable topological generator $\gamma$ of the
pro-$p$-completion of $I$ and we may modify $\rho$ within its strict deformation
class relative to $\bar\rho$ so that the restriction of $\rho$ to the element $\gamma$ is given by
the matrix

\[
\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}.
\]

Next, suppose that:

(B) The image of $I$ under $\bar\rho$ is nontrivial, and is contained in a subgroup of
$\mathrm{GL}_2(k)$ which is conjugate to

\[
\begin{pmatrix} * & 0 \\ 0 & 1 \end{pmatrix}.
\]

In this case, say that a deformation $\rho$ of $\bar\rho$ to a coefficient-ring $A$ with
residue field $k$ is minimally ramified if the image under $\rho$ of $I$ is finite,
and of order prime to $p$ (equivalently: if $\rho(I)$ is finite and has the same
order as $\bar\rho(I)$).


Note. Except for the conditions on the determinant, these conditions are 
the types A and B respectively on p. 458 of [W]. 


Proposition. Fix $\Lambda$ a coefficient-ring with residue field $k$ a finite field of
characteristic $p \neq \ell$. For $A$ any coefficient-$\Lambda$-algebra, let $\mathcal{D}$ be the condition
on a representation

\[
\rho : G_{K_\lambda} \to \mathrm{GL}_2(A)
\]

that its associated residual representation

\[
\bar\rho : G_{K_\lambda} \to \mathrm{GL}_2(k)
\]

be minimally ramified, either in the sense of (A) or of (B) above, and
that $\rho$ itself also be (correspondingly) minimally ramified. Then $\mathcal{D}$ is a
deformation condition, and the sub-$A$-module

\[
H^1_{\mathcal{D}}(G_{K_\lambda}, \mathrm{End}_A(V)) \subset H^1(G_{K_\lambda}, \mathrm{End}_A(V))
\]

is given by

\[
H^1(G_{k_\lambda}, \mathrm{End}_A(V)^I) \subset H^1(G_{K_\lambda}, \mathrm{End}_A(V))
\]

where $k_\lambda$ is the residue field of $K_\lambda$, where superscript $I$ means the sub
$A$-module of elements fixed by $I$, and where the inclusion above is the natural
one.


Proof. Let us first show that, in either of the two cases (A) or (B), the
condition of being minimally ramified is a “deformation condition” in the
sense that it satisfies Properties 1-3 of §23. This is immediate in case (B);
so assume that we are in case (A). The following Lemma will be useful.


Lemma 1. Let $A$ be a commutative noetherian local ring with residue
field $k$. Let $V$ be a free $A$-module of rank two over $A$, and $\gamma : V \to V$ an
$A$-linear homomorphism. These conditions are equivalent.

(i) There is an $A$-basis of $V$ with respect to which the action of $\gamma$ is given
by the matrix

\[
\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}.
\]

(ii) The image of $\gamma - 1 : V \to V$ is equal to $\ker(\gamma - 1)$, and is a free
$A$-module of rank 1, sitting as a direct-summand in $V$.
(iii) We have $(\gamma - 1)^2 = 0$, and $\gamma \otimes k$ is not the identity in $V \otimes_A k$.


Proof. (i) implies (ii) which implies (iii). Now assume (iii), and let $\nu =
\gamma - 1$, so that $\nu^2 = 0$ on $V$. Consider the mapping $\nu : V \to V$ which we
break up as

\[
V \to \nu \cdot V \subset V[\nu] \subset V.
\tag{1}
\]

Since $\nu \otimes_A k : V \otimes_A k \to V \otimes_A k$ is nonzero, and since, by (1), $\nu \otimes_A k$
factors through $V[\nu] \otimes_A k \to V \otimes_A k$, we have that the inclusion $V[\nu] \subset V$
induces a nontrivial homomorphism $V[\nu] \otimes_A k \to V \otimes_A k$. Therefore we
can find an element $x$ in $V[\nu] \subset V$ which reduces nontrivially in $V \otimes_A k$.
By Nakayama's lemma (and the fact that $V$ is free of rank 2 over $A$) we
can find an element $y$ in $V$ such that $x, y$ is a free $A$-basis for $V$. The matrix
for $\gamma$ in terms of the basis $x, y$ is then

\[
\begin{pmatrix} 1 & u \\ 0 & 1 + v \end{pmatrix}
\]

where $u \cdot v = 0$, $v^2 = 0$, and $u$ and $v$ are not both in the maximal ideal of
$A$. It follows that $u$ is a unit, and consequently $v = 0$. Changing $x$ to $u \cdot x$,
we have found a basis for which the matrix for $\gamma$ is:

\[
\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}.
\]
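
The last step of the proof of Lemma 1 amounts to a short matrix computation, which we sketch here. The assumption, as in the proof, is that $x$ is fixed by $\gamma$ and that $x, y$ is an $A$-basis of $V$; the entries $u, v \in A$ are defined by $\gamma y = u x + (1 + v) y$, and $x'$ is a name introduced here for the rescaled basis vector.

```latex
% Columns of a matrix are the images of the basis vectors x, y.
\begin{align*}
\nu = \gamma - 1 &= \begin{pmatrix} 0 & u \\ 0 & v \end{pmatrix}, \qquad
\nu^2 = \begin{pmatrix} 0 & uv \\ 0 & v^2 \end{pmatrix} = 0
  \;\Longrightarrow\; uv = 0,\ v^2 = 0, \\
v^2 = 0 &\;\Longrightarrow\; v \in \mathfrak{m}_A
  \quad \text{(nilpotent elements of a local ring lie in its maximal ideal)}, \\
\gamma \otimes k \neq 1 &\;\Longrightarrow\; u \notin \mathfrak{m}_A
  \;\Longrightarrow\; u \in A^*, \qquad v = u^{-1}(uv) = 0, \\
x' = u x &:\qquad \gamma x' = x', \qquad \gamma y = x' + y .
\end{align*}
```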


Lemma 1 is helpful in showing that the condition $\mathcal{D}$ of being “minimally
ramified” is a deformation condition in the sense that it satisfies properties
1-3 of §23. First, by the remark just after the definition of case (A) it
is clear that the condition of minimal ramification is functorial in couples
$(A, V)$, i.e., that Property 1 holds.


Proof of Property 2. Let $A, B, C$ be artinian coefficient-$\Lambda$-algebras with
coefficient-$\Lambda$-algebra homomorphisms $\alpha : A \to C$ and $\beta : B \to C$. Let $\rho$
be a lifting of $\bar\rho$ to $A \times_C B$.

Clearly, if $\rho$ satisfies condition (A) then so do its projections to $A$ and
to $B$. We must show the converse; i.e., that $\rho$ satisfies condition (A)
if its projections to $A$ and $B$ do. This boils down to showing that if an
endomorphism $\gamma$ (of a free rank two $A \times_C B$-module) satisfies the equivalent
conditions of Lemma 1 when tensored with $A$ and with $B$, then it satisfies
those conditions even before such tensoring. But this is evident when you
choose to look at condition (iii) of Lemma 1.


Proof of Property 3. We must show that if $A_0 \to A$ is an injective
homomorphism of coefficient-$\Lambda$-algebras and if $\rho_0$ is a deformation of $\bar\rho$ to
$A_0$ which “becomes” minimally ramified when you extend its scalars to $A$,
then $\rho_0$ is already minimally ramified.

Let $H_0$ be $A_0 \times A_0$ viewed as $A_0$-module with $G_{K_\lambda}$-action via $\rho_0$ and
$H = A \otimes_{A_0} H_0$ with its induced $G_{K_\lambda}$-action. Let $\gamma_0$ be the $A_0$-linear
endomorphism of $H_0$ obtained by the action of a topological generator of
the pro-$p$-completion of the inertia group $I \subset G_{K_\lambda}$ and let $\gamma$ be the induced
$A$-linear endomorphism of $H$. By assumption, we have that $\gamma$ satisfies the
three equivalent conditions of Lemma 1, and we choose to concentrate,
again, on condition (iii). It follows that $\gamma_0$ also satisfies this condition
because $H_0 \to H$ is injective and $H \otimes k = H_0 \otimes k$.


We now must establish the last assertion of our proposition. Let $\rho$ be
a minimally ramified lifting of $\bar\rho$ to the coefficient-$\Lambda$-algebra $A$, and denote
by $\rho_0$ the canonical lifting of $\rho$ to $A[\epsilon]$ (i.e., $\rho_0$ is the composition of $\rho$
with the homomorphism $\mathrm{GL}_2(A) \to \mathrm{GL}_2(A[\epsilon])$ induced from the natural
injection $A \to A[\epsilon]$). For any lifting $\rho'$ of $\rho$ to $A[\epsilon]$, we must show that
$\rho'$ is minimally ramified if and only if the “difference cocycle” $c_{\rho'}$ of §18
determines a cohomology class in the sub-$A$-module

\[
H^1(G_{k_\lambda}, \mathrm{End}_A(V)^I) \subset H^1(G_{K_\lambda}, \mathrm{End}_A(V)),
\]

or equivalently, if and only if the restriction of $c_{\rho'}$ to the inertia subgroup
$I$ is cohomologous to zero. We do this in detail in case (A), the argument
being more direct in case (B). Assume, then, that $\rho$ is minimally ramified
in the sense of (A) above.


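Lemma 2. $\rho'$ is minimally ramified if and only if $\rho_0$ and $\rho'$, when restricted
to $I$, are in the same strict equivalence class relative to $\rho$.


Proof. Let $\gamma_0$ and $\gamma'$ denote the image of a topological generator of the
pro-$p$-completion of $I$ in $\mathrm{GL}_2(A[\epsilon])$ under the homomorphisms $\rho_0$ and $\rho'$
respectively. Our assumptions give us that, after a suitable choice of
topological generator and basis of $A \times A$, we may assume that $\gamma_0$ is the matrix

\[
\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix},
\]

that $\gamma' = \gamma_0 + \epsilon \cdot m$ for a matrix $m \in M_2(A)$, and that $(\gamma' - 1)^2 = 0$. These
conditions give us, by straightforward calculation, that the matrix $m$ is of
the following form

\[
m = \begin{pmatrix} a & b \\ 0 & -a \end{pmatrix}
\]

and therefore $\gamma' = (1 + \epsilon \cdot n)\,\gamma_0\,(1 + \epsilon \cdot n)^{-1}$ is the conjugate of $\gamma_0$ by the
matrix $1 + \epsilon \cdot n$ where

\[
n = \begin{pmatrix} b & 0 \\ -a & 0 \end{pmatrix}.
\]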

§30. The condition of being “ordinary.” We keep to the local context
and degree two representations, i.e., as in the previous section, we shall be
considering residual representations

\[
\bar\rho : G_{K_\lambda} \to \mathrm{GL}_2(k)
\]

where $K_\lambda$ is a local field, but now we suppose that $\ell$, the residual
characteristic of $K_\lambda$, is equal to $p$, the characteristic of the finite field $k$. As before,
denote by $I$ the inertia subgroup of $G_{K_\lambda}$.

We say that $\bar\rho$ is “ordinary and ramified” if it is equivalent to a
representation of the form

\[
\begin{pmatrix} \bar\chi_1 & * \\ 0 & \bar\chi_2 \end{pmatrix}
\]

where $\bar\chi_1$ is an unramified character and $\bar\chi_2$ is ramified.


Note. The condition of “ordinary-ness” found in [W] p. 457 also allows 
representations such as above, but which are unramified; but in that case, 
the deformation problem is posed differently. 
Say that a deformation $\rho$ of an ordinary and ramified residual
representation $\bar\rho$ to a coefficient-$\Lambda$-algebra $A$ is an ordinary deformation if it is
equivalent to a representation of the form

\[
\begin{pmatrix} \chi_1 & * \\ 0 & \chi_2 \end{pmatrix}
\]

for characters $\chi_1$ and $\chi_2 : G_{K_\lambda} \to A^*$, with $\chi_1$ unramified.

Our discussion of “ordinariness” will use almost nothing concerning the
nature of the local Galois group $G_{K_\lambda}$ and its inertia subgroup (beyond the
fact that the inertia subgroup is normal), and in fact will also apply mutatis
mutandis to the case (B) treated in the previous section. To emphasize
this, let $G$ be any profinite group, and $I \subset G$ any closed, normal subgroup.


Proposition 1. Let $\rho : G \to \mathrm{GL}_2(A)$ be a representation. These are
equivalent:


(i) The representation $\rho$ is equivalent to a representation of the form

\[
\begin{pmatrix} \chi_1 & * \\ 0 & \chi_2 \end{pmatrix}
\]

for characters $\chi_1$ and $\chi_2 : G \to A^*$ such that $\chi_1$ is trivial on $I$ and the
residual character $\bar\chi_2 : G \to k^*$ is nontrivial on $I$.
(ii) The representation $A$-module $V(\rho) = V$ admits an $I$-stable filtration

\[
0 \to V^I \to V \to W \to 0
\]

which splits noncanonically as a sequence of $A$-modules, where $V^I$ is the
$A$-module of $I$-invariant elements in $V$, where both $V^I$ and $W$ are free
$A$-modules of rank 1, and finally where the $I$-representation space $\bar W = W \otimes_A k$
is not the trivial representation.

(iii) The sub-$A$-module $V^I \subset V$ of $I$-invariant elements is free over $A$ of
rank 1; the natural mapping $V^I \to \bar V^I$ is surjective, where $\bar V = V \otimes_A k$,
and $\bar V^I$ denotes the $k$-vector space of $I$-invariant elements in $\bar V$; the action
of $I$ on the quotient $\bar W = \bar V/\bar V^I$ is not trivial.


Note. If these conditions hold we will say that the representation is I- 
ordinary, or if J is understood, we say that it is ordinary. 


Proof. It is evident that (i) implies (ii) and that (ii) implies (iii). It suffices,
then, to show that (iii) implies (i). Suppose (iii). Let $\alpha$ be an $A$-basis of
the (free, rank one) $A$-module $V^I$ and let $\beta$ be any element in $V$ whose
projection to $\bar V$ does not lie in $\bar V^I$. Let $W = A \oplus A$ denote the free
$A$-module of rank two, and consider the homomorphism $\varphi : W \to V$ which
sends $(x, y)$ to $x \cdot \alpha + y \cdot \beta$. Since $\varphi$ projects surjectively to $\bar V$, by Nakayama's
Lemma, $\varphi$ is surjective. Since $V$ is free over $A$, the $A$-homomorphism $\varphi$
admits a right-inverse $\psi : V \to W$, giving that $W = V \oplus \ker(\varphi)$. But since
$W$ is free of rank two over $A$, counting dimensions of $W \otimes_A k$ gives us that
$\ker(\varphi) \otimes_A k$ vanishes, and therefore $\ker(\varphi)$ also vanishes, by Nakayama's
Lemma. It follows that $\varphi$ is an isomorphism. Writing $\rho$ in terms of our
choice of basis $W = A \oplus A$ we have that the image of $I$ under $\rho$ is contained
in the semi-Borel subgroup

\[
\begin{pmatrix} 1 & * \\ 0 & * \end{pmatrix}.
\]

Since $I$ is normal in $G$, the image of $\rho$ stabilizes $V^I$ and therefore $\rho$ is
ordinary in the sense defined above.


Remark. It follows (e.g., from (ii)) that if $\rho$ is ordinary and $V = V(\rho)$, then
no nontrivial $A[I]$-subquotient of the (free, rank-one) $A$-module $W = V/V^I$
has trivial $I$-action.

Given an $I$-ordinary representation $\rho : G \to \mathrm{GL}_2(A)$ we define

\[
\mathrm{End}(V)_{\mathrm{ord}} \subset \mathrm{End}(V) = \mathrm{End}_A(V)
\]

to be the rank two free sub $A$-module

\[
\mathrm{Hom}_A(V/V^I, V) \subset \mathrm{Hom}_A(V, V) = \mathrm{End}_A(V).
\]

Since the action of $G$ on $V$ stabilizes $V^I$ the sub-module $\mathrm{End}(V)_{\mathrm{ord}} \subset
\mathrm{End}(V)$ is stabilized under the adjoint action of $G$ on $\mathrm{End}(V)$.
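
In coordinates this sub-module and its stability are easy to see. The following is a sketch, assuming an $A$-basis $\alpha, \beta$ of $V$ with $\alpha$ spanning the submodule of $I$-invariants, so that each element $g$ of $G$ acts by an upper triangular matrix; the entry names $a, b, d, x, y$ are introduced here for illustration.

```latex
% Endomorphisms factoring through the quotient by the I-invariants are
% exactly those killing \alpha, i.e. matrices with vanishing first column.
\begin{align*}
\mathrm{End}(V)_{\mathrm{ord}}
  &= \Bigl\{ \begin{pmatrix} 0 & x \\ 0 & y \end{pmatrix} : x, y \in A \Bigr\}
  \qquad \text{(free of rank two over } A\text{)}, \\
g = \begin{pmatrix} a & b \\ 0 & d \end{pmatrix},\ a, d \in A^*
  &:\qquad
  g \begin{pmatrix} 0 & x \\ 0 & y \end{pmatrix} g^{-1}
  = \begin{pmatrix} 0 & (ax + by)\,d^{-1} \\ 0 & y \end{pmatrix}
  \in \mathrm{End}(V)_{\mathrm{ord}} .
\end{align*}
```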


Proposition 2. The homomorphism

\[
H^1(G, \mathrm{End}(V)_{\mathrm{ord}}) \to H^1(G, \mathrm{End}(V))
\tag{14}
\]

induced by the inclusion $\mathrm{End}(V)_{\mathrm{ord}} \subset \mathrm{End}(V)$ of $A[G]$-modules is an
injection.


Proof. Injectivity of (14) comes from consideration of the long exact
sequence of $G$-cohomology coming from the exact sequence

\[
0 \to \mathrm{End}_A(V)_{\mathrm{ord}} \to \mathrm{End}_A(V) \xrightarrow{\ j\ } \mathrm{Hom}_A(V^I, V) \to 0
\]

once we show that the homomorphism $j$ induces a surjection on $G$-invariant
elements. But the $A$-module of $G$-invariant elements in $\mathrm{Hom}_A(V^I, V)$ is
generated by the natural injection $V^I \to V$ and $j$ maps the identity in
$\mathrm{End}(V)$ to this natural injection.
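
The relevant portion of the long exact sequence reads as follows. This is a sketch; the symbol $\partial$ for the connecting homomorphism is notation introduced here.

```latex
% Low-degree part of the long exact sequence of G-cohomology.
\[
\mathrm{End}(V)^G \xrightarrow{\ j\ } \mathrm{Hom}_A(V^I, V)^G
  \xrightarrow{\ \partial\ } H^1(G, \mathrm{End}(V)_{\mathrm{ord}})
  \longrightarrow H^1(G, \mathrm{End}(V))
\]
% If j is surjective on G-invariants then \partial = 0, so by exactness the
% kernel of the last map, which equals the image of \partial, vanishes.
```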


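Definition. The “$I$-ordinary cohomology group,”

\[
H^1_{\mathrm{ord}}(G, \mathrm{End}_A(V)) \subset H^1(G, \mathrm{End}_A(V))
\]

is the image of the injection (14). Thus

\[
H^1_{\mathrm{ord}}(G, \mathrm{End}_A(V)) \cong H^1(G, \mathrm{End}(V)_{\mathrm{ord}}).
\]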
Proposition 3. Fix a profinite group $G$ and a closed normal subgroup
$I \subset G$. If $\mathcal{D}$ is the class of $I$-ordinary representations of $G$, then $\mathcal{D}$ is a
“deformation condition” in the sense of §23, and if $\rho : G \to \mathrm{GL}_2(A)$ is
ordinary, then the sub-$A$-module

\[
H^1_{\mathcal{D}}(G, \mathrm{End}_A(V)) \subset H^1(G, \mathrm{End}_A(V))
\]

is given by the $I$-ordinary cohomology submodule,

\[
H^1_{\mathrm{ord}}(G, \mathrm{End}(V)) \subset H^1(G, \mathrm{End}_A(V)).
\]


Proof. Fixing the normal subgroup $I$, we shall refer to $I$-ordinary
representations as simply “ordinary.” We first must show that $\mathcal{D}$ satisfies Properties
1-3 of §23. Property 1 is evident from the definition of ordinariness. For
the remaining two properties, it is convenient to put subscripts on our
notation to indicate the coefficient-$\Lambda$-algebra we are dealing with: so if

\[
\rho_A : G \to \mathrm{GL}_2(A)
\]

is a representation and $V_A := V(\rho_A)$ its representation space, and if $A \to C$
is a homomorphism of coefficient-$\Lambda$-algebras, let $\rho_C$, $V_C$ denote the induced
representation and representation space. In particular, $V_C = V_A \otimes_A C$. We
note that in this circumstance, we have:


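Lemma 1.
(a) Let $\rho_A$ be ordinary, and $A \to C$ a homomorphism of
coefficient-$\Lambda$-algebras. The natural homomorphism of $C$-modules

\[
(V_A)^I \otimes_A C \to (V_C)^I
\tag{15}
\]

is an isomorphism.
(b) If we are given a diagram

\[
\begin{array}{ccc}
A & & B \\
 & {\scriptstyle\alpha}\searrow \quad \swarrow{\scriptstyle\beta} & \\
 & C &
\end{array}
\]

in $\mathcal{C}_\Lambda$ and an ordinary representation $\rho : G \to \mathrm{GL}_2(A \times_C B)$ with
representation space $V$, we have natural isomorphisms

\[
V \cong V_A \times_{V_C} V_B
\tag{16}
\]

and

\[
V^I \cong (V_A)^I \times_{(V_C)^I} (V_B)^I.
\tag{17}
\]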
Proof. (a) This follows directly from consideration of the (noncanonically)
split short exact sequence of $A$-modules occurring in condition (ii) of
Proposition 1.

(b) This is immediate.


Proof of Property 2. If $\rho_A$ and $\rho_B$ are ordinary, then $(V_A)^I$ and $(V_B)^I$ are
free of rank one over $A$ and $B$ respectively, and therefore by part (a) of
Lemma 1, $(V_C)^I$ is also free of rank one over $C$. Moreover, the mappings
$(V_A)^I \to (V_C)^I$ and $(V_B)^I \to (V_C)^I$ come by tensoring $(V_A)^I$ and $(V_B)^I$
with $\alpha : A \to C$ and $\beta : B \to C$ respectively. By (17) we then have that
$V^I$ is free of rank one over $A \times_C B$, and by (15) we have that the natural
mapping $V^I \otimes k \to \bar V^I$ is an isomorphism (the tensor product with $k$ being
taken over $A \times_C B$). Therefore $V$ satisfies condition (iii) in Proposition 1,
and $\rho : G \to \mathrm{GL}_2(A \times_C B)$ is ordinary.


Proof of Property 3. Now suppose given a representation $\rho_A : G \to \mathrm{GL}_2(A)$
and an injection of coefficient-$\Lambda$-algebras $\iota : A \to C$ such that $\rho_C$ is
ordinary.

We must show that $\rho_A$ is ordinary. We use the notational conventions
$V_A$ and $V_C$ for the corresponding representation spaces, and view $V_A$ as
sub-$A[I]$-module of $V_C$. We have that $W_C := V_C/(V_C)^I$ is a free $C$-module
of rank one, and its $I$-action is via a character $\chi : I \to C^*$ which takes
its values in $A^*$ since $\chi$ may be identified with the determinant character
of $\rho_A$. We have the further information that the residual character
$\bar\chi : I \to k^*$ is nontrivial. Since $\iota$ is an injection of coefficient-$\Lambda$-algebras, $\iota$
induces an isomorphism of residue fields of $A$ and $C$, and gives us a natural
identification $\bar V_A = \bar V_C$.

Since

\[
(V_A)^I = V_A \cap (V_C)^I
\]

the natural mapping

\[
W_A := V_A/(V_A)^I \to W_C = V_C/(V_C)^I
\tag{18}
\]

is injective. We have the exact sequence

\[
(V_A)^I \otimes_A k \to \bar V_A \to W_A \otimes_A k \to 0.
\tag{19}
\]

From injectivity of (18) and the fact that $W_A$ does not vanish, the action
of $I$ on $W_A$ is via the character $\chi$. In particular, since $\bar\chi$ is nontrivial, no
nontrivial subquotient of $W_A \otimes_A k$ has trivial $I$-action. Since $\bar V_A = \bar V_C$, we
then see from exactness of (19) that the $k$-dimension of $W_A \otimes_A k$ is equal
to one. Now let us compare (19) with the corresponding exact sequence
for $C$:

\[
0 \to (V_C)^I \otimes_C k \to \bar V_C \to W_C \otimes_C k \to 0.
\tag{20}
\]


308 B. Mazur 


We get a commutative diagram of $k$-vector spaces

\[
\begin{array}{ccccc}
\bar V_A & \longrightarrow & W_A \otimes_A k & \longrightarrow & 0 \\
\| & & \downarrow h & & \\
\bar V_C & \longrightarrow & W_C \otimes_C k & \longrightarrow & 0
\end{array}
\]

from which we see that $h$ is surjective (hence also injective since both
domain and range are $k$-vector spaces of dimension 1). It follows from
Nakayama's Lemma that $W_A \otimes_A C \to W_C$ is surjective (hence is an
isomorphism since $W_A$ is a cyclic $A$-module, and $W_C$ is free as a $C$-module).
It follows that the annihilator ideal of $W_A$ in $A$ is trivial. Therefore the
cyclic $A$-module $W_A$ is free over $A$ of rank 1. We then get that the exact
sequence of $A[I]$-modules

\[
0 \to (V_A)^I \to V_A \to W_A \to 0
\]

splits (noncanonically) as an exact sequence of $A$-modules. Since $V_A$ is
free of rank 2 over $A$, an application of Nakayama's Lemma then gives us
that $(V_A)^I$ is a cyclic $A$-module, possessing trivial annihilator by (15) since
$(V_C)^I$ is free of positive rank over $C$. That is, $(V_A)^I$ is also free of rank one
over $A$, and we have shown all we need to show to conclude that $\rho_A$ satisfies
condition (iii) of Proposition 1.


To conclude the proof of Proposition 3 we must consider the “difference
cocycle” $c_{\rho'}$ of a lifting $\rho'$ of $\rho$ to $A[\epsilon]$, as in §18 and, by a direct calculation,
show that $\rho'$ is $I$-ordinary if and only if this cocycle takes its values in the sub
$A$-module $\mathrm{Hom}_A(V/V^I, V) \subset \mathrm{End}_A(V)$. We leave this as an exercise.


§31. The condition of being “finite flat.” Here we consider the case
of $K_\lambda = \mathbf{Q}_\ell$, the field of $\ell$-adic numbers, with $\ell \neq 2$. Let $\bar{\mathbf{Q}}_\ell$ be a choice
of algebraic closure of $\mathbf{Q}_\ell$. As usual, $G_{\mathbf{Q}_\ell}$ denotes $\mathrm{Gal}(\bar{\mathbf{Q}}_\ell/\mathbf{Q}_\ell)$. If $A$ is a
coefficient-$\Lambda$-algebra, and

\[
\rho : G_{\mathbf{Q}_\ell} \to \mathrm{GL}_N(A)
\]

is a continuous representation, we say that $\rho$ is finite flat (we probably
should rather call it pro-finite flat) if for every artinian quotient $A_0$ of $A$
the induced representation $\rho_0 : G_{\mathbf{Q}_\ell} \to \mathrm{GL}_N(A_0)$ has the property that its
representation space $V(\rho_0)$, viewed as finite abelian group with $G_{\mathbf{Q}_\ell}$-action,
is the $G_{\mathbf{Q}_\ell}$-Galois module of $\bar{\mathbf{Q}}_\ell$-rational points of some finite flat group
scheme over $\mathrm{Spec}(\mathbf{Z}_\ell)$. If $\mathcal{D}$ is the condition of being “finite flat,” then $\mathcal{D}$
satisfies Ramakrishna's “categorical conditions” (cf. §25) and therefore is
indeed a “deformation condition.” Also by Proposition 2 of §25, if $A$ is
artinian and $\rho$ is finite flat coming from the finite flat group scheme $M$
over $\mathrm{Spec}(\mathbf{Z}_\ell)$ we have

\[
H^1_{\mathcal{D}}(G_{\mathbf{Q}_\ell}, \mathrm{End}(V)) \cong \mathrm{Ext}_{\mathrm{Spec}(\mathbf{Z}_\ell)}(M, M)
\]


DEFORMATION THEORY OF GALOIS REPRESENTATIONS 309 


where $V$, the representation space of $\rho$, is the $G_{\mathbf{Q}_\ell}$-module $M(\bar{\mathbf{Q}}_\ell)$ and
where $\mathrm{Ext}_{\mathrm{Spec}(\mathbf{Z}_\ell)}(-,-)$ means Ext in the category of finite flat group
schemes over $\mathrm{Spec}(\mathbf{Z}_\ell)$. In contrast, however, with the other local
conditions we have considered, the calculation of $\mathrm{Ext}_{\mathrm{Spec}(\mathbf{Z}_\ell)}(M, M)$ is more
difficult and at present it has only been carried out in detail in the specific
case of interest, namely for $N = 2$. This calculation depends on Fontaine's
articles [Fo 1], [Fo 2]; see also the article by Fontaine-Laffaille [F-L]. See
[W], and for a treatment of finite flat groups over finite field extensions of
$\mathbf{Q}_\ell$ with ramification index $< \ell - 1$, see B. Conrad's Princeton Ph.D. thesis,
and his article [Co] in this volume.


A brief glossary of terminology and notation 
(listed by the section in which they make their first appearance) 


§1: $G_{K,S}$, $p$-finiteness condition, Galois representation.

§2: coefficient-ring, coefficient-ring homomorphism. 

§4: residual representation $\bar\rho$.

§5: deformation of p to a coefficient-ring A, strict equivalence class, 
the categories C(A), C(A), A-augmentation, the functor D,, the 
deformation problem for p. 

§9: universal deformation, the universal deformation ring $R(\bar\rho)$ attached
to a residual representation $\bar\rho$, universal deformation space.

§11: the functors Dy, D7 a. 

§14: the categories C(A), C(A), pro-representable, fiber product, Carte- 
sian diagram, Mayer-Vietoris property. 

§15: Zariski tangent (k-vector) space, tr,tp the “Tangent Space Hy- 
pothesis” (T,). 

§16: The Zariski tangent A-module, the “Tangent A-module Hypothe- 
sis” (T 4), tD,A- 

§17: Continuous Kahler differentials, ORK = Ora, tp,a = tp. 

§18: Schlessinger's conditions (H 1)-(H 4), nearly representable,
pro-representable hull, smooth morphism of functors.

§19: relatively representable subfunctor. 

§20: weak near representability, strong near representability. 

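§23: deformation condition, $\mathcal{F}_N(\Lambda; \Pi)$, $\mathcal{D}\mathcal{F}_N(\Lambda; \Pi)$, type $\mathcal{D}$.

§24: “determinant $\delta$.”

§25: $\mathrm{Rep}_\Lambda(\Pi)$, $\mathcal{P}\mathcal{F}_N$.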

§26: Global Galois deformation problem. 

§29: minimally ramified. 

§30: ordinary, $I$-ordinary.
EXPLICIT CONSTRUCTION OF
UNIVERSAL DEFORMATION RINGS 


BART DE SMIT AND HENDRIK W. LENSTRA, JR.


1. Introduction 


Let G be a profinite group and let k be a field. By a k-representation of G 
we mean a finite dimensional vector space over k with the discrete topology, 
equipped with a continuous k-linear action of G. If V is a k-representation 
of G and A is a complete local ring with residue field k, then a deformation 
of V in A is an isomorphism class of continuous representations of G over 
A that reduce to V modulo the maximal ideal of A; precise definitions are 
given in Section 2. We denote by Def(V, A) the set of such deformations. 

Let V be an absolutely irreducible k-representation of G. The object 
of this chapter is to give a straight-forward construction of a ring R, the 
universal deformation ring, which represents the functor Def(V,—). In 
a purely algebraic setting, without considerations of continuity, a similar 
construction was already given by Procesi in the seventies [9, Chap. IV, 
Lemma 1.7; 10]. The existence of R in the present context was deduced 
first by Mazur [8] with Schlessinger’s criteria for pro-representability [12]. 
An alternative construction was given recently by Faltings (see [5] and 
Section 7 below). 

The main result of this chapter, formulated below as Theorem (2.3), is 
actually a little more general than Mazur’s. Following Schlessinger, Mazur 
works only with noetherian rings, and this forces him to assume at the 
outset that a certain cohomology group is finite. For our argument, the 
noetherian condition is a hindrance, and we find it more convenient to 
follow Grothendieck [6] and work with not necessarily noetherian rings 
that are projective limits of artinian rings. This allows us to drop Mazur’s 
cohomological condition; it reappears only at the end, as a necessary and 
sufficient condition for R to be noetherian. 

Our construction of R proceeds in three steps. First we let G be finite, 
and we consider the functor that assigns to A a certain set of homomorphisms
$G \to \mathrm{Gl}_n(A)$. Proving that this functor is representable is very
easy: one just defines the corresponding ‘universal’ ring by generators and 
relations. Next, we take a projective limit and obtain a similar result for 
arbitrary profinite G (Proposition (2.5)). To conclude the construction, we 
pass to the closed subring generated by the traces of the elements of G; the 
proof that this ring has the required properties makes use of an argument 
of Serre [3, Théorème 2].

It is in the last step of the construction that the absolute irreducibility 
of V is crucially used. In Wiles’s proof of Fermat’s Last Theorem the 
existence of deformation rings is only needed for such V. Wiles also uses 
the fact that such deformation rings are generated by traces [13, pp. 509- 
512], so the approach above is particularly suitable for Wiles’s applications. 
It is, however, of interest to observe that the universal deformation ring
also exists when V, instead of being absolutely irreducible, satisfies the
weaker condition $\mathrm{End}_{k[G]}(V) = k$. In the noetherian case this was shown by
Ramakrishna [11], as a consequence of Schlessinger’s criteria. The general
case is proved in Section 7. Instead of taking the subring generated by the
traces we pass to the subring generated by a larger collection of elements, 
as suggested by an argument due to Faltings [5, Section 2.6]. We do not 
know whether a similar result holds in Procesi’s purely algebraic setting. 

Following Ramakrishna [11] we indicate in Section 6 how one can im- 
pose additional conditions on the deformations to obtain “ordinary” and 
“flat” deformation rings. 


2. Main results 


We denote the maximal ideal of a local ring A by $\mathfrak{m}_A$.


(2.1) Local complete rings. Let O be a noetherian local ring with
residue field k. We denote by C the category of local topological O-algebras
A that satisfy the following two conditions: the natural map $O \to A/\mathfrak{m}_A$
is surjective (so that k is also the residue field of A), and the map from
A to the projective limit of its discrete artinian quotients is a topological
isomorphism. Equivalently, the second condition asserts that A is complete
and that its topology can be given by a collection of open ideals $\mathfrak{a}$ for which
$A/\mathfrak{a}$ is artinian. Morphisms in C are continuous O-algebra homomorphisms.


(2.2) Deformations. Let O and k be as above, let A be a ring in C, 
and let G be a topological group. A representation of G over A, or an 
A-representation of G, is a finitely generated free A-module M with a 
continuous A-linear action; here we give M the product topology via an 
A-module isomorphism $M \cong A^n$, a topology that is independent of the
choice of the isomorphism. Two A-representations M and M’ are said to
be isomorphic if there is an A[G]-module isomorphism $M \xrightarrow{\sim} M'$, and we
denote this by $M \cong_{A[G]} M'$.

Let V be a k-representation of G. By a deformation of V in A we mean 
an isomorphism class of A-representations W of G for which $W \otimes_A k \cong_{k[G]} V$.
The set of such deformations is denoted by Def(V, A). A morphism
$f: A \to A'$ in C gives rise to a map $f_*: \mathrm{Def}(V, A) \to \mathrm{Def}(V, A')$ that sends
the class of a representation W over A to the class of $W \otimes_A A'$.


Throughout the paper V is a representation of a profinite group G over 
the residue field k (with the discrete topology) of a noetherian local ring 
O, and C is as above. 


(2.3) Theorem. If V is absolutely irreducible then 

(1) there are a ring R in C and a deformation $D \in \mathrm{Def}(V, R)$ such that for
all rings A in C we have a bijection $\mathrm{Hom}_C(R, A) \xrightarrow{\sim} \mathrm{Def}(V, A)$ given
by $f \mapsto f_*(D)$;

(2) the pair (R, D) is determined up to unique C-isomorphism by the
property in (1);

(3) the ring R is noetherian if and only if $\dim_k H^1(G, \mathrm{End}_k(V)) < \infty$;

(4) if R is noetherian then the following hold: R is $\mathfrak{m}_R$-adically complete
and for each A in C we have a well-defined bijection

$$\mathrm{Hom}_{O\text{-Alg}}(R, A) \xrightarrow{\sim} \mathrm{Def}(V, A)$$

given by $f \mapsto f_*(D)$.


Recall that V is absolutely irreducible if $V \otimes_k K$ is a simple K[G]-module
for every field extension K of k. The $H^1$ in (3) denotes the continuous
cohomology group of the discrete G-module $\mathrm{End}_k(V)$, on which the G-action
is given by $(g\varphi)(v) = g\varphi(g^{-1}v)$ for $\varphi \in \mathrm{End}_k(V)$ and $v \in V$. By
“$\mathrm{Hom}_{O\text{-Alg}}$” we denote the set of O-algebra homomorphisms.

Statement (2) of the theorem follows from (1) by the standard unique- 
ness argument for universal objects. Statement (4) will follow immediately 
from (1) and the following proposition. 


(2.4) Proposition. Suppose A is a noetherian ring in C. Then the topology
on A is equal to the $\mathfrak{m}_A$-adic topology, and A is $\mathfrak{m}_A$-adically complete.
Furthermore, every O-algebra homomorphism $A \to A'$ with A’ in C is
continuous.


The proof of (2.4) and the proof of part (3) of (2.3) are postponed to 
Section 5. By (2.4), the category C’ whose objects are complete noetherian 
local O-algebras with residue field k and whose morphisms are O-algebra 
homomorphisms is a full subcategory of C. We will use later that a closed
sub-O-algebra A’ of a ring A in C is again in C, which follows from the fact
that a sub-O-algebra of an artinian ring in C is again an artinian ring in C. 
However, if A is in C’ then A’ need not be in C’. 

We will show (1) by an explicit construction, which starts by representing
an easier functor. For this we will write representations as homomorphisms
to matrix groups. Let V be any k-representation of G. If one
chooses a k-basis $v_1, \ldots, v_n$ for V, then the G-action on V is given by a
continuous homomorphism $\bar\rho: G \to \mathrm{Gl}_n(k)$. Now let W be a representation of
G over some A in C such that $W/\mathfrak{m}_A W = W \otimes_A k \cong_{k[G]} V$. By Nakayama’s
lemma elements $w_1, \ldots, w_n \in W$ such that $w_i \mapsto v_i$ form an A-basis of W.
The G-action on W is then given by a continuous group homomorphism
$\rho: G \to \mathrm{Gl}_n(A)$ such that the composite map $G \to \mathrm{Gl}_n(A) \to \mathrm{Gl}_n(k)$ is $\bar\rho$.
We denote the set of such maps $\rho$ by $\mathrm{CHom}_{\bar\rho}(G, \mathrm{Gl}_n(A))$. Here “CHom”
denotes the set of continuous homomorphisms, and the subscript $\bar\rho$ expresses
the condition that the homomorphisms considered reduce to $\bar\rho$ over
the residue field k of A.


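(2.5) Proposition. There are a ring $R_{\bar\rho}$ in C and a map

$$\rho_{\bar\rho} \in \mathrm{CHom}_{\bar\rho}(G, \mathrm{Gl}_n(R_{\bar\rho}))$$

such that for each A in C we have a bijection

$$\mathrm{Hom}_C(R_{\bar\rho}, A) \xrightarrow{\sim} \mathrm{CHom}_{\bar\rho}(G, \mathrm{Gl}_n(A))$$

that sends a C-morphism f to the composite map

$$G \xrightarrow{\rho_{\bar\rho}} \mathrm{Gl}_n(R_{\bar\rho}) \xrightarrow{\mathrm{Gl}_n(f)} \mathrm{Gl}_n(A).$$

The pair $(R_{\bar\rho}, \rho_{\bar\rho})$ is determined up to unique isomorphism by this property.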


The ring $R_{\bar\rho}$ will be constructed in Section 3 as a projective limit over the
discrete quotients of G of complete O-algebras that are explicitly defined
by generators and relations. The map $\rho_{\bar\rho}$ defines a representation $W_{\bar\rho} = R_{\bar\rho}^n$
of G in $R_{\bar\rho}$ such that $W_{\bar\rho} \otimes_{R_{\bar\rho}} k \cong_{k[G]} V$. We now let R be the smallest
closed sub-O-algebra of $R_{\bar\rho}$ that contains the traces of all matrices $\rho_{\bar\rho}(g)$
with $g \in G$. Note that R is in C again. The following result asserts that
we can define the representation $W_{\bar\rho}$ of G over the subring R. We let D be
the R[G]-isomorphism class of this R-representation.


(2.6) Proposition. Let W be a representation of G over some ring A in 
C and let $A' \subset A$ be an inclusion of rings in C so that A’ has the induced
topology of A. Suppose that A’ contains the traces of all endomorphisms of
W that are given by multiplication with an element of G, and suppose that
$W \otimes_A A/\mathfrak{m}_A$ is absolutely irreducible. Then there is an A’-representation
W’ of G such that $W' \otimes_{A'} A \cong_{A[G]} W$.


Proposition (2.6) is a variation of results due to Serre [3, Théorème 2] and
Mazur [8, Proposition 4]. 

Let us assume (2.6) for the moment and prove that the pair (R, D) 
satisfies statement (1) of the theorem. Let W be a representation of G over 
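a ring A in C for which $W \otimes_A k \cong_{k[G]} V$. Choosing a basis of W as in the
argument before (2.5), one can give the G-action on W by a continuous
homomorphism $\rho \in \mathrm{CHom}_{\bar\rho}(G, \mathrm{Gl}_n(A))$. By (2.5) there is a C-morphism
$f_{\bar\rho}: R_{\bar\rho} \to A$ such that the composite map
$G \xrightarrow{\rho_{\bar\rho}} \mathrm{Gl}_n(R_{\bar\rho}) \xrightarrow{\mathrm{Gl}_n(f_{\bar\rho})} \mathrm{Gl}_n(A)$ is
equal to $\rho$. Then the restriction $f: R \to A$ of $f_{\bar\rho}$ has the property that
$f_*(D)$ is the A[G]-isomorphism class of W.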

The trace of an element of G in some representation of G depends 
only on the representation up to isomorphism. Given $f_*(D)$ the map f is
therefore uniquely determined on the traces of $\rho_{\bar\rho}(g)$ for all $g \in G$. But the
O-algebra generated by these traces is dense in R, and f is continuous, so 
f is uniquely determined. This proves the universal property (1) in (2.3) 
once we know (2.5) and (2.6). 
3. Lifting homomorphisms to matrix groups


In this section we prove (2.5). The last statement in (2.5) follows by the 
usual uniqueness argument. 

Suppose first that G is finite, and denote its identity element by e. We 
define O[G, n] to be the commutative O-algebra given by 


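generators: $X^g_{ij}$ for $g \in G$ and $1 \le i, j \le n$;

relations: $X^e_{ij} = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \ne j; \end{cases}$

$$X^{gh}_{ij} = \sum_{l=1}^{n} X^g_{il} X^h_{lj} \quad \text{for } g, h \in G \text{ and } 1 \le i, j \le n.$$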


For example, O[G, 1] is just the group ring of the largest abelian quotient 
of G over O. 
For every O-algebra A we have a canonical bijection 


(3.1) $\mathrm{Hom}_{O\text{-Alg}}(O[G, n], A) \cong \mathrm{Hom}(G, \mathrm{Gl}_n(A))$,


where an O-algebra homomorphism $f: O[G, n] \to A$ corresponds to the
group homomorphism $\rho_f$ that sends $g \in G$ to the matrix $(f(X^g_{ij}))_{i,j}$.
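
As an illustration of (3.1), which is not part of the original text, the following Python sketch checks the two defining relations of O[G, n] on a toy example. We assume G cyclic of order 3 (written additively), n = 2 and O = Z, and we send the generator to an integer matrix of order 3. The names `mat_mul`, `RHO` and `satisfies_relations` are hypothetical helpers chosen for this sketch. An assignment of matrices to group elements satisfies the relations exactly when it is a group homomorphism to the invertible matrices, which is the content of the bijection.

```python
from itertools import product

N = 3  # order of the cyclic group G = Z/3Z, written additively as {0, 1, 2}
n = 2  # size of the matrices

def mat_mul(a, b):
    """Multiply two n x n integer matrices given as nested lists."""
    return [[sum(a[i][l] * b[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]

IDENTITY = [[1, 0], [0, 1]]
C = [[0, -1], [1, -1]]  # companion matrix of x^2 + x + 1, so C has order 3

# Candidate values for the generators: X^g_{ij} is sent to RHO[g][i][j].
RHO = {0: IDENTITY, 1: C, 2: mat_mul(C, C)}

def satisfies_relations(rho):
    """Check X^e = identity and X^{gh}_{ij} = sum_l X^g_{il} X^h_{lj}."""
    if rho[0] != IDENTITY:
        return False
    for g, h in product(range(N), repeat=2):
        if rho[(g + h) % N] != mat_mul(rho[g], rho[h]):
            return False
    return True

print(satisfies_relations(RHO))
```

Replacing C by a matrix of infinite order, such as an upper triangular unipotent matrix, makes the second relation fail, so that assignment does not define a homomorphism out of O[G, n].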

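By (3.1) the homomorphism $\bar\rho: G \to \mathrm{Gl}_n(k)$ gives rise to an O-algebra
homomorphism $O[G, n] \to k$. Its kernel is a maximal ideal, which we denote
by $\mathfrak{m}_{\bar\rho}$. Now let $R_{\bar\rho}$ be the completion of O[G, n] at $\mathfrak{m}_{\bar\rho}$. Certainly $R_{\bar\rho}$ is
noetherian and lies in C. The canonical map $O[G, n] \to R_{\bar\rho}$ gives by (3.1)
a map $\rho_{\bar\rho}: G \to \mathrm{Gl}_n(R_{\bar\rho})$ such that the diagram

$$\begin{array}{ccc}
G & \xrightarrow{\ \rho_{\bar\rho}\ } & \mathrm{Gl}_n(R_{\bar\rho}) \\
 & {\scriptstyle \bar\rho} \searrow & \downarrow \\
 & & \mathrm{Gl}_n(k)
\end{array}$$

commutes.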

To prove that the map in (2.5) is a bijection, let A be a ring in C
and let $\rho \in \mathrm{CHom}_{\bar\rho}(G, \mathrm{Gl}_n(A))$. By (3.1), there is a unique O-algebra
homomorphism $f: O[G, n] \to A$ such that $\rho_f = \rho$. The fact that $\rho$ reduces
to $\bar\rho$ modulo $\mathfrak{m}_A$ implies that $f(\mathfrak{m}_{\bar\rho}) \subset \mathfrak{m}_A$. The topology on A is given by
open ideals $\mathfrak{a}$ for which $A/\mathfrak{a}$ is artinian, and the map $O[G, n] \to A \to A/\mathfrak{a}$
is continuous for the $\mathfrak{m}_{\bar\rho}$-adic topology on O[G, n] for each such $\mathfrak{a}$. We
therefore obtain a continuous O-algebra homomorphism $f: R_{\bar\rho} \to A$ for
which the diagram

$$\begin{array}{ccc}
G & \xrightarrow{\ \rho_{\bar\rho}\ } & \mathrm{Gl}_n(R_{\bar\rho}) \\
 & {\scriptstyle \rho} \searrow & \downarrow {\scriptstyle \mathrm{Gl}_n(f)} \\
 & & \mathrm{Gl}_n(A)
\end{array}$$

commutes. Since the elements $f(X^g_{ij})$ are determined by $\rho$, and the $X^g_{ij}$
generate a dense sub-O-algebra of $R_{\bar\rho}$, the map f is uniquely determined
by the conditions that it be continuous and that the diagram commute.
This finishes the proof of (2.5) in the case that G is finite.

For the general case, write G as $G = \varprojlim H$ with H ranging over those
discrete quotients of G for which the representation $\bar\rho: G \to \mathrm{Gl}_n(k)$ factors
through a map $\bar\rho_H: H \to \mathrm{Gl}_n(k)$. Each H is finite, so the construction above
produces a ring $R_H$ in C with a group homomorphism $H \to \mathrm{Gl}_n(R_H)$ that
reduces to $\bar\rho_H: H \to \mathrm{Gl}_n(k)$. Using (2.5) for each H we get a projective
system $(R_H)_H$ in C.

Now let $R_{\bar\rho} = \varprojlim_H R_H$. We have a continuous map $\rho_{\bar\rho}: G \to \mathrm{Gl}_n(R_{\bar\rho})$
induced by the composite maps $G \to H \to \mathrm{Gl}_n(R_H)$. For fixed H, the
images of the defining generators of O[H, n] generate each discrete artinian
quotient of $R_H$ over O. But these images are contained in the image of $R_{\bar\rho}$,
so $R_{\bar\rho}$ surjects to each discrete artinian quotient of $R_H$. Moreover, each
discrete artinian quotient of $R_{\bar\rho}$ arises in this way. In particular it follows
that $R_{\bar\rho}$ lies in C.

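Let $A = \varprojlim_i A_i$ be a ring in C written as a projective limit of its
discrete artinian quotients. We now have canonical isomorphisms

$$\begin{aligned}
\mathrm{CHom}_{\bar\rho}(G, \mathrm{Gl}_n(A)) &\cong \varprojlim_i \mathrm{CHom}_{\bar\rho}(G, \mathrm{Gl}_n(A_i)) \\
&\cong \varprojlim_i \varinjlim_H \mathrm{Hom}_{\bar\rho_H}(H, \mathrm{Gl}_n(A_i)) \\
&\cong \varprojlim_i \varinjlim_H \mathrm{CHom}_{O\text{-Alg}}(R_H, A_i) \\
&\overset{(*)}{\cong} \varprojlim_i \mathrm{CHom}_{O\text{-Alg}}(R_{\bar\rho}, A_i) \\
&\cong \mathrm{CHom}_{O\text{-Alg}}(R_{\bar\rho}, A).
\end{aligned}$$

For (∗) we use that a continuous homomorphism $R_{\bar\rho} \to A_i$ factors over some
artinian quotient R’ of $R_{\bar\rho}$, and that R’ can be chosen to be an artinian
quotient of some $R_H$. This proves (2.5).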


4. The condition of absolute irreducibility 


In this section we show (2.6). Let $V = W \otimes_A k$. The G-action on V gives
an O-algebra homomorphism $\bar\rho: k[G] \to \mathrm{End}_k(V)$. The irreducibility of V
implies that $D = \mathrm{End}_{k[G]}(V)$ is a division ring, and since V is absolutely
irreducible, the tensor product $D \otimes_k K = \mathrm{End}_{K[G]}(V \otimes_k K)$ is also a
division ring for any field extension K of k. This implies that D = k. By
Wedderburn’s theorem [7, chap. XVII, 3.5] one then deduces that
$k[\bar\rho(G)] = \mathrm{End}_k(V)$.

Choosing a k-basis of V we may identify the k-algebra $\mathrm{End}_k(V)$ with
the ring $M_n(k)$ of n × n-matrices over k. Let $\bar e_1, \ldots, \bar e_{n^2}$ be a k-basis
of $\mathrm{End}_k(V)$ for which each matrix $\bar e_i$ has exactly one non-zero entry. We
denote the trace of an endomorphism f of a finitely generated free module
over a ring R by $\mathrm{Tr}_R(f)$. An easy computation shows that the determinant
of the matrix $(\mathrm{Tr}_k(\bar e_i \bar e_j))_{i,j} \in M_{n^2}(k)$ does not vanish.
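
The "easy computation" can be made concrete. The sketch below is our own illustration, with hypothetical helper names `elementary_basis`, `gram_matrix` and `determinant`; it builds the matrix of traces for the basis of elementary matrices and computes its determinant over the rationals. Since the trace of a product of two elementary matrices is 1 exactly when the second is the transpose of the first and 0 otherwise, the matrix of traces is a permutation matrix, so its determinant is ±1 and hence non-zero in every field k.

```python
from fractions import Fraction

def elementary_basis(n):
    """Return the n^2 elementary n x n matrices (a single entry 1, all others 0)."""
    basis = []
    for a in range(n):
        for b in range(n):
            basis.append([[1 if (i, j) == (a, b) else 0 for j in range(n)]
                          for i in range(n)])
    return basis

def trace_of_product(a, b, n):
    """Trace of the matrix product a * b."""
    return sum(a[i][l] * b[l][i] for i in range(n) for l in range(n))

def gram_matrix(n):
    """Matrix (Tr(e_i e_j))_{i,j} for the elementary basis of M_n."""
    basis = elementary_basis(n)
    return [[trace_of_product(a, b, n) for b in basis] for a in basis]

def determinant(matrix):
    """Determinant by Gaussian elimination with exact rational arithmetic."""
    m = [[Fraction(x) for x in row] for row in matrix]
    size = len(m)
    det = Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size):
                m[r][c] -= factor * m[col][c]
    return det

for n in range(1, 5):
    print(n, determinant(gram_matrix(n)))
```

The same observation explains the later step in the proof: a matrix over A’ whose reduction is such a permutation matrix has a unit determinant, so it is invertible.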

Let B be the sub-A’-algebra of $\mathrm{End}_A(W)$ generated by the image of G.
Denote the natural map $\mathrm{End}_A(W) \to \mathrm{End}_k(V)$ by $\varphi$. Then we have
$\varphi(B) = k[\bar\rho(G)] = \mathrm{End}_k(V)$, so we can choose $e_i \in B$ such that $\varphi(e_i) = \bar e_i$. Since
$\varphi$ induces an isomorphism $\mathrm{End}_A(W) \otimes_A k \xrightarrow{\sim} \mathrm{End}_k(V)$, it follows from
Nakayama’s lemma that the $e_i$ form an A-basis of $\mathrm{End}_A(W)$. We claim
that they also form an A’-basis of B. Indeed, if we write an element $b \in B$
on this basis as $b = \sum_i a_i e_i$ with $a_i \in A$, then we have

$$\sum_{i=1}^{n^2} a_i \, \mathrm{Tr}_A(e_i e_j) = \mathrm{Tr}_A(b e_j) \in A',$$

because $\mathrm{Tr}_A(B) \subset A'$. The coefficient matrix $(\mathrm{Tr}_A(e_i e_j))_{i,j} \in M_{n^2}(A')$ is
invertible, because it is invertible modulo $\mathfrak{m}_{A'}$. Therefore all $a_i$ lie in A’,
which proves our claim. It follows that $B \otimes_{A'} A = \mathrm{End}_A(W)$.

Choose an idempotent $\bar\eta$ in the ring $\mathrm{End}_k(V)$ that generates a minimal
left-ideal; e.g., take a matrix with one diagonal entry equal to 1 and all other
entries equal to 0. We claim that there exists $\eta \in B$ such that $\eta^2 = \eta$ and
$\varphi(\eta) = \bar\eta$. If $x \in B$ and $l \ge 1$ are such that $x \equiv x^2 \bmod \mathfrak{m}_{A'}^l B$, then it
is easy to check that $f(x) = 3x^2 - 2x^3$ satisfies $f(x) \equiv x \bmod \mathfrak{m}_{A'}^l B$ and
$f(x)^2 \equiv f(x) \bmod \mathfrak{m}_{A'}^{2l} B$. Now choose any $\eta_0 \in B$ with $\varphi(\eta_0) = \bar\eta$ and
consider the sequence $\eta_0, f(\eta_0), f(f(\eta_0)), \ldots$. This is clearly a Cauchy
sequence for the $\mathfrak{m}_{A'}$-adic topology on B. But A’ is a projective limit of
artinian rings, so its $\mathfrak{m}_{A'}$-adic topology is at least as strong as the given
topology on A’, for which it is complete. This means that the sequence is a
Cauchy sequence for the product topology on the free A’-module B, so that
the sequence converges to a limit $\eta$ in B. This $\eta$ satisfies our conditions.
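
The iteration with the polynomial 3x^2 - 2x^3 can be tried out in a toy setting. The sketch below is an illustration under our own assumptions, not the situation of the proof: in place of B and the ideal generated by the maximal ideal of A’ we take the ring of integers modulo a power of 10 and the ideal generated by 10. The identities used are formal in a single element, so the same doubling of precision occurs. The function names `lift_step` and `lift_idempotent` are hypothetical.

```python
def lift_step(x, modulus):
    """One step x -> 3x^2 - 2x^3, reduced modulo `modulus`."""
    return (3 * x * x - 2 * x ** 3) % modulus

def lift_idempotent(x0, base, exponent):
    """Lift an idempotent x0 modulo `base` to an idempotent modulo base**exponent.

    Each step keeps the class of x modulo the current precision and doubles
    the power of `base` modulo which x is idempotent.
    """
    modulus = base ** exponent
    x = x0 % modulus
    precision = 1  # invariant: x*x - x is divisible by base**precision
    while precision < exponent:
        x = lift_step(x, modulus)
        precision *= 2
    return x

eta = lift_idempotent(5, 10, 4)
print(eta, (eta * eta - eta) % 10**4)
```

Starting from 5, which is idempotent modulo 10, two steps give 625, and 625 squared is 390625, which again ends in 0625.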

We have $B\eta \oplus B(1 - \eta) = B$, and B is a free A’-module. It follows
that the B-module $W' = B\eta$ is also free over A’, and from $\varphi(\eta) = \bar\eta$
we see that its rank over A’ equals $\dim_k(\mathrm{End}_k(V)\bar\eta) = n$. Choose an
element $w_0$ of W whose image $v_0$ in V satisfies $\bar\eta v_0 \ne 0$. Then we have
$\mathrm{End}_k(V)\bar\eta v_0 = V$, so Nakayama’s lemma implies that the $\mathrm{End}_A(W)$-linear
map $W' \otimes_{A'} A = \mathrm{End}_A(W)\eta \to W$ sending $\sigma$ to $\sigma w_0$ is surjective. By
checking A-ranks one sees that it is an isomorphism. It follows that W
and $W' \otimes_{A'} A$ are isomorphic over $B \otimes_{A'} A$, and in particular they are
A[G]-isomorphic. It also follows that the G-action on W’ is continuous. □


The following result will be needed for the proof of part (3) of (2.3). 


(4.1) Lemma. Let A be a local ring with residue field k and let G be 
a group. Let $\bar\rho: G \to \mathrm{Gl}_n(k)$ be a group homomorphism that makes $k^n$
into an absolutely irreducible k[G]-module. Then two elements
$\rho, \rho' \in \mathrm{Hom}_{\bar\rho}(G, \mathrm{Gl}_n(A))$ define isomorphic A[G]-module structures on $A^n$ if and
only if there is a matrix $M \in \mathrm{Gl}_n(A)$ reducing to the identity matrix in
$\mathrm{Gl}_n(k)$ such that $\rho(g) = M\rho'(g)M^{-1}$ for all $g \in G$.


Proof. The only non-trivial point is the following: if there exists
$M \in \mathrm{Gl}_n(A)$ such that $\rho(g) = M\rho'(g)M^{-1}$ for all $g \in G$, then M can be chosen
so that its reduction $\bar M \in \mathrm{Gl}_n(k)$ is the identity matrix. Note that $\bar M$ lies
in $\mathrm{Aut}_{k[G]}(k^n)$, which by the first paragraph of the proof above is just $k^*$.
But the scalar matrix $\bar M$ can then be lifted to a scalar matrix T in $\mathrm{Gl}_n(A)$,
and we can now replace M by $MT^{-1}$. □


5. Projective limits 
In this section we show (2.4) and statement (3) of (2.3). 

Let A be a ring in C which is given as a projective limit $\varprojlim_i A_i$ of a
collection of discrete artinian quotients, where i ranges over some directed
index set. We let $\mathfrak{m}$ and $\mathfrak{m}_i$ be the maximal ideals of A and $A_i$.


(5.1) Lemma. Suppose that we have a sequence of projective systems

$$(M_i^1)_i \longrightarrow (M_i^2)_i \overset{\psi}{\longrightarrow} (M_i^3)_i$$

which for each i is an exact sequence of finitely generated $A_i$-modules.
Assume also that for each i’ < i and j = 1, 2, 3, the transition map
$M_i^j \to M_{i'}^j$ is $A_i$-linear. Then the induced sequence

$$\varprojlim_i M_i^1 \longrightarrow \varprojlim_i M_i^2 \overset{\psi}{\longrightarrow} \varprojlim_i M_i^3$$

is an exact sequence of A-modules.


Proof. The projective limits are A-modules by the condition on the tran- 
sition maps. It is clear that the maps between them are A-linear, and that 
the composition of the two maps is zero. 

Suppose that $(x_i)_i$ is an element in the kernel of $\psi$. Let

$$E_i = \{x \in M_i^1 : x \mapsto x_i\}.$$

We need to show that $\varprojlim E_i$ is non-empty. In the case that k is finite
one can see this by remarking that $\prod_i E_i$ is compact, and that $\varprojlim E_i$ is
the intersection of a collection of closed subsets with the property that any
finite subcollection has a non-empty intersection.

For the general case the reader is referred to the criterion for projective
limits to be non-empty given in Bourbaki [2, III.7.4, Théorème 1]. To apply
this criterion one lets $\mathfrak{S}_i$ be the set of subsets of $E_i$ of the form x + N,
where $x \in E_i$ and where N is a sub-$A_i$-module of the kernel of the map
$M_i^1 \to M_i^2$ (see also [2, loc. cit., Exemple II]). □


(5.2) Remark. With a similar argument we will show the following,
which will be used in Section 6. If X is a collection of open ideals I of A
which is closed under taking finite intersections, then the canonical map
$\varphi: A \to A' = \varprojlim_{I \in X} A/I$ induces a topological isomorphism $A/F \xrightarrow{\sim} A'$,
where $F = \bigcap_{I \in X} I$. Clearly, $\varphi$ is continuous, and $\mathrm{Ker}\,\varphi = F$. Suppose first
that k is finite. Then A and A’ are compact and $\varphi(A)$ is a dense compact
subset of A’, so $\varphi$ is surjective. A continuous bijection between compact
Hausdorff spaces is a homeomorphism, so our claim follows.

Let us sketch the argument for general k. For $I \in X$ let $A_i^I$ be the
cokernel of the map $I \to A_i$. Since $A_i$ is artinian, it surjects to $\varprojlim_I A_i^I$,
and by (5.1) the ring A surjects to $\varprojlim_i \varprojlim_I A_i^I = \varprojlim_I \varprojlim_i A_i^I$. Since
I is open we have $\varprojlim_i A_i^I = A/I$, and it follows that $\varphi$ is surjective. In
the same way one shows that the image in A’ of any open ideal $\mathfrak{a}$ of A is
$\varprojlim_I (\mathfrak{a} + I)/I$, which is open in A’ because by (5.1) it is the kernel of the
continuous map from A’ to the discrete ring $\varprojlim_I A/(\mathfrak{a} + I)$. Thus, $\varphi$ is an
open map, and the map $A/F \to A'$ is a homeomorphism.


(5.3) Proposition. The following two statements are equivalent:
(1) A is noetherian;
(2) $\dim_k(\mathfrak{m}_i/\mathfrak{m}_i^2)$ is a bounded function of i.
If they hold, then the following are also true:
(3) $\mathfrak{m}^a = \varprojlim_i \mathfrak{m}_i^a$ for all $a \ge 0$;
(4) the topology on A is the $\mathfrak{m}$-adic topology.

This proposition implies (2.4). To obtain the last statement of (2.4), write
$A' = \varprojlim_i A'_i$ with $A'_i$ artinian and note that for each i the map
$A \to A' \to A'_i$ is continuous in the $\mathfrak{m}$-adic topology on A. We already used this
argument to show (2.5) in the case that G is finite.


Proof. Suppose that A is noetherian. Then $\mathfrak{m}$ can be generated as an
A-ideal by a finite number d of elements of $\mathfrak{m}$. Since $\mathfrak{m}$ surjects to $\mathfrak{m}_i$ we
have $\dim_k(\mathfrak{m}_i/\mathfrak{m}_i^2) \le d$ for each i, so (1) implies (2).

Now assume that (2) holds. We need to show (1), (3) and (4). We
start with (3). The statement is trivial for a = 0, and we will proceed
by induction on a. Assume (3) holds for a and consider the sequence of
projective systems

$$0 \to \mathfrak{m}_i^{a+1} \to \mathfrak{m}_i^a \to \mathfrak{m}_i^a/\mathfrak{m}_i^{a+1} \to 0.$$

Assumption (2) implies that $\mathfrak{m}_i^a/\mathfrak{m}_i^{a+1}$ also has bounded dimension, so the
system on the right stabilizes, i.e., all transition maps for $i \ge i'$ are
isomorphisms if i’ is large enough. This implies that its limit is a finite dimensional
k-vector space N. By (5.1) and the induction hypothesis we have a short
exact sequence

$$(*) \qquad 0 \to \varprojlim_i \mathfrak{m}_i^{a+1} \to \mathfrak{m}^a \to N \to 0.$$

Choose elements $b_1, \ldots, b_l$ of $\mathfrak{m}^a$ whose images in N form a basis of N
over k. For each i we have a surjection $A_i^l \to \mathfrak{m}_i^a$, sending $(x_1, \ldots, x_l)$ to
$x_1 b_1 + \cdots + x_l b_l$. Taking limits we deduce from (5.1) and the induction
hypothesis that $\mathfrak{m}^a$ is generated by $b_1, \ldots, b_l$ as an A-ideal. We now have
$l \ge \dim_k(\mathfrak{m}^a/\mathfrak{m}^{a+1}) \ge \dim_k(N) = l$, so $\mathfrak{m}^{a+1}$ is equal to the kernel of the
map $\mathfrak{m}^a \to N$. By the sequence (∗) above, this gives the induction step.
This shows (3).
Applying (5.1) to the sequence

$$0 \to \mathfrak{m}_i^a \to A_i \to A_i/\mathfrak{m}_i^a \to 0$$

and using (3) we get $A/\mathfrak{m}^a = \varprojlim_i A_i/\mathfrak{m}_i^a$. Again with (2) one sees that this
system stabilizes. But this means that the map $A \to A/\mathfrak{m}^a$ factors through
$A_i$ for some i, so that $\mathfrak{m}^a$ is open in A. We already mentioned in Section 4
that the $\mathfrak{m}$-adic topology on a ring in C is at least as strong as the given
topology, so in this case the two topologies coincide. This shows (4).

We now know that A is $\mathfrak{m}$-adically complete, and that $\mathfrak{m}$ is a finitely
generated A-ideal. To prove that A is noetherian we use a standard argument,
which also goes into the proof that a completion of a noetherian
ring is noetherian. The graded ring $G(A) = \bigoplus_{m \ge 0} \mathfrak{m}^m/\mathfrak{m}^{m+1}$ is a finitely
generated k-algebra, which is noetherian by Hilbert’s basis theorem. By
[1, (10.25)] this implies that A is noetherian. This shows (1). □


Proof of part (3) of (2.3). We consider deformations of V in the ring
$A = k[\varepsilon]$ with $\varepsilon^2 = 0$. Write R as a projective limit of its discrete artinian
quotients $R_i$. Let $\mathfrak{m}_i$ be the maximal ideal of $R_i$. One easily sees that

$$\mathrm{Hom}_C(R, k[\varepsilon]) = \varinjlim_i \mathrm{Hom}_{O\text{-Alg}}(R_i, k[\varepsilon]) = \varinjlim_i \mathrm{Hom}_k(\mathfrak{m}_i/(\mathfrak{m}_i^2 + \mathfrak{m}_O R_i), k).$$

Let us denote the rightmost set by T, and note that T is a vector space
over k. Recall that O is noetherian, so that the k-dimension d of $\mathfrak{m}_O/\mathfrak{m}_O^2$
is finite. Clearly $\dim_k(\mathfrak{m}_i/(\mathfrak{m}_i^2 + \mathfrak{m}_O R_i))$ and $\dim_k(\mathfrak{m}_i/\mathfrak{m}_i^2)$ differ by at
most d. Since the transition maps in the injective limit are injective, the
dimension of T is finite if and only if the dimension of $\mathfrak{m}_i/\mathfrak{m}_i^2$ is bounded,
which by (5.3) is equivalent to R being noetherian.

By part (1) of (2.3) the set $\mathrm{Def}(V, k[\varepsilon])$ can be identified with T, so
after choosing a basis of V over k one gets a surjection

$$\mathrm{CHom}_{\bar\rho}(G, \mathrm{Gl}_n(k[\varepsilon])) \to T.$$

We have $\mathrm{Gl}_n(k[\varepsilon]) = \mathrm{Gl}_n(k) \oplus M_n(k)\varepsilon$, and one easily checks that the
homomorphisms on the left are exactly the maps $g \mapsto (1 + c(g)\varepsilon)\bar\rho(g)$ for
which $c: G \to M_n(k)$ is a continuous 1-cocycle. Moreover, it follows from
(4.1) that two 1-cocycles give the same deformation in $k[\varepsilon]$ if and only if they
differ by a coboundary, so that we get a bijection $H^1(G, \mathrm{End}_k(V)) \xrightarrow{\sim} T$.
In the case that k is finite, statement (3) follows at once. For the general
case one checks that this bijection is k-linear, so that the same conclusion
holds. □
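
The correspondence between lifts to the dual numbers and 1-cocycles can be checked by hand in a small case. The following sketch is our own illustration under stated assumptions: G is cyclic of order 4 acting through the rotation matrix R, all matrices have integer entries (so the check holds over any field after reduction), and an element a + bε of the matrix ring over the dual numbers is stored as the pair (a, b). The names `RHO`, `X`, `c`, `lift` and `dual_mul` are hypothetical. We take c to be the coboundary attached to a matrix X, with G acting on matrices by conjugation, and verify that the resulting map is a homomorphism and equals the conjugate of the original representation by 1 + Xε, so that it gives the trivial deformation.

```python
IDENT = [[1, 0], [0, 1]]
R = [[0, -1], [1, 0]]  # rotation by 90 degrees, of order 4

def mul(a, b):
    return [[sum(a[i][l] * b[l][j] for l in range(2)) for j in range(2)]
            for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def sub(a, b):
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]

def power(m, e):
    result = IDENT
    for _ in range(e):
        result = mul(result, m)
    return result

RHO = [power(R, g) for g in range(4)]  # residual representation on Z/4Z
X = [[1, 2], [3, 4]]                   # arbitrary matrix defining a coboundary

def c(g):
    """Coboundary c(g) = X - rho(g) X rho(g)^{-1}."""
    return sub(X, mul(mul(RHO[g], X), RHO[(-g) % 4]))

def lift(g):
    """The element (1 + c(g) eps) rho(g), stored as (constant part, eps part)."""
    return (RHO[g], mul(c(g), RHO[g]))

def dual_mul(p, q):
    """(a + b eps)(a2 + b2 eps) = a a2 + (a b2 + b a2) eps, since eps^2 = 0."""
    a, b = p
    a2, b2 = q
    return (mul(a, a2), add(mul(a, b2), mul(b, a2)))

def is_cocycle():
    """Check c(gh) = c(g) + rho(g) c(h) rho(g)^{-1} for all g, h."""
    return all(
        c((g + h) % 4) == add(c(g), mul(mul(RHO[g], c(h)), RHO[(-g) % 4]))
        for g in range(4) for h in range(4))

def is_homomorphism():
    return all(lift((g + h) % 4) == dual_mul(lift(g), lift(h))
               for g in range(4) for h in range(4))

def conjugate(g):
    """(1 + X eps) rho(g) (1 - X eps) = rho(g) + (X rho(g) - rho(g) X) eps."""
    return (RHO[g], sub(mul(X, RHO[g]), mul(RHO[g], X)))

print(is_cocycle(), is_homomorphism(), all(lift(g) == conjugate(g) for g in range(4)))
```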


6. Restrictions on deformations 


In this section a class of additional properties of deformations is identified 
for which one gets a representable sub-functor of the deformation functor. 

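Suppose that for each ring A in C a subset S(A) of Def(V, A) is given
such that for each A in C and $D \in \mathrm{Def}(V, A)$ the following hold:


(1) we have $D \in S(A)$ if and only if $D/\mathfrak{a}D \in S(A/\mathfrak{a})$ for all open ideals
$\mathfrak{a} \ne A$ in A;

(2) if $\mathfrak{a}$ and $\mathfrak{b}$ are open ideals ≠ A of A such that $D/\mathfrak{a}D \in S(A/\mathfrak{a})$ and
$D/\mathfrak{b}D \in S(A/\mathfrak{b})$, then $D/(\mathfrak{a} \cap \mathfrak{b})D \in S(A/(\mathfrak{a} \cap \mathfrak{b}))$;

(3) if $A \subset A'$ is an inclusion of artinian rings in C, then $D \in S(A)$ if and
only if $D \otimes_A A' \in S(A')$.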


(6.1) Proposition. For any C-morphism $f: A \to A'$ we have $f_*(S(A)) \subset S(A')$.
If V is absolutely irreducible, then there is a closed ideal $\mathfrak{a}$ of the
universal deformation ring R such that the map $\mathrm{Hom}_C(R, A) \xrightarrow{\sim} \mathrm{Def}(V, A)$
in (2.3) induces a bijection $\mathrm{Hom}_C(R/\mathfrak{a}, A) \xrightarrow{\sim} S(A)$.


Proof. Let A be a ring in C and $D \in \mathrm{Def}(V, A)$. Using (5.2) one deduces
from conditions (1) and (2) above that there is a unique closed ideal $\mathfrak{a}_D$ of
A such that for every open ideal $\mathfrak{a}$ of A we have $D/\mathfrak{a}D \in S(A/\mathfrak{a})$ if and
only if $\mathfrak{a} \supset \mathfrak{a}_D$. By condition (1) we have $D \in S(A)$ if and only if $\mathfrak{a}_D = 0$.

Now let $f: A \to A'$ be a C-morphism and put $D' = D \otimes_A A'$, where
the tensor product is taken via f. Let $\mathfrak{a}'$ be an open A’-ideal and write
$\mathfrak{a} = f^{-1}(\mathfrak{a}')$. By condition (3) we have $D'/\mathfrak{a}'D' \in S(A'/\mathfrak{a}')$ if and only
if $D/\mathfrak{a}D \in S(A/\mathfrak{a})$. Therefore, $\mathfrak{a}_{D'} \subset \mathfrak{a}'$ if and only if $f(\mathfrak{a}_D) \subset \mathfrak{a}'$. In
particular, $D' \in S(A')$ if and only if Ker f contains $\mathfrak{a}_D$.

The first statement of the proposition now follows at once, and by
taking $\mathfrak{a} = \mathfrak{a}_D \subset R$, where D is the universal deformation, we obtain the
second statement. □


(6.2) Ordinary deformations. Suppose that I is a closed subgroup of G.
A 2-dimensional representation W of G over a ring A in C is said to be
ordinary if the sub-A-module $W^I$ of I-invariants is a direct summand of
W of A-rank 1 (cf. [8, 1.7]). Suppose that V is 2-dimensional, absolutely
irreducible, and ordinary. We want to show that the ordinary deformations
form a representable functor on C.

Using the fact that V is ordinary one can see that $D \in \mathrm{Def}(V, A)$ is
ordinary if and only if the I-action on D is given by matrices
$\begin{pmatrix} 1 & * \\ 0 & * \end{pmatrix}$ on
a suitable A-basis of D, and if and only if $D^I$ contains an element z not
mapping to 0 in V. Now choose an element $g_0 \in I$ that does not act
trivially on V. Then one checks that D is ordinary if and only if D is
annihilated by the elements $(g - 1)(g_0 - \det\rho(g_0)) \in A[G]$ with $g \in I$ (for
the if-part, choose $z = (g_0 - \det\rho(g_0))y$ for suitable y). It is easy to verify
that conditions (1)-(3) hold for this latter property.


(6.3) Flat deformations. Assume that k is a finite field of characteristic p.
Let K be a finite field extension of the field $\mathbf{Q}_p$ of p-adic numbers, let
$O_K$ be its ring of integers, and let $G = \mathrm{Gal}(\bar K/K)$, where $\bar K$ is an algebraic
closure of K. We say that a Z[G]-module of finite cardinality is flat if it is
G-isomorphic to the group of points in $\bar K$ of a finite flat group scheme over $O_K$.
The flatness property is preserved under passing to finite products, submodules,
and quotients [11; 4]. Let us sketch the argument. For products
it is clear. Suppose that $X' \subset X$ are Z[G]-modules and that $X = \mathcal{G}(\bar K)$ for
a finite flat group scheme $\mathcal{G} = \mathrm{Spec}\, A$ over $O_K$. Let I be the kernel of the
map $A \to \prod_{x \in X'} \bar K$. The comultiplication $m^*: A \to A \otimes A$ induces a
comultiplication on $A' = A/I$ and on $A'' = \{x \in A : m^*(x) \equiv x \otimes 1 \bmod A \otimes I\}$.
Then $\mathcal{G}' = \mathrm{Spec}\, A'$ and $\mathcal{G}'' = \mathrm{Spec}\, A''$ are finite flat group schemes over
$O_K$ and one checks that $\mathcal{G}'(\bar K) = X'$ and $\mathcal{G}''(\bar K) = X/X'$.

A deformation of V in an artinian ring A in C is said to be flat if it is 
flat as a Z[G]-module. Use condition (1) to define flatness for deformations 
to arbitrary rings A in C. Then one easily checks (2) and (3). For (3) one 
notes that D’ contains D as a sub-Z[G]-module, and that D’ is a quotient 
of a finite product of copies of D. Thus, the flat deformation functor on C 
is representable if V is absolutely irreducible and flat. 


7. Relaxing the absolute irreducibility condition 


In this section we will show that our main result already holds when
$\mathrm{End}_{k[G]}(V) = k$. We saw in Section 4 that this is a weaker condition
on V than absolute irreducibility. This improved result will not be needed
in the rest of this book.


(7.1) Proposition. If $\mathrm{End}_{k[G]}(V) = k$ then statements (1)-(4) of (2.3)
hold.


Proof. We will use the same construction as before, but we need to pass 
to a different subring of R,: we may need more elements than the traces 
of the actions of the group elements. In order to describe a suitable set of 
elements we explain Faltings’s notion of “well-placed” representations. 
We choose a basis for V over k, so that the G-action on V is given by
a continuous group homomorphism $\bar\rho: G \to \mathrm{Gl}_n(k)$. Since $M_n(k)$ is
finite-dimensional over k, we can choose a finite number of elements $g_1, \ldots, g_r$ in
G such that the only matrices in $M_n(k)$ commuting with all $\bar\rho(g_i)$ are the
scalar matrices. Let a lift $E_i \in M_n(O)$ of each $\bar\rho(g_i)$ be chosen. For any ring
A in C we let $M_n^0(A)$ be the matrix ring $M_n(A)$ modulo scalars; this is a
free A-module of rank $n^2 - 1$. By Nakayama’s lemma one sees that we have
a split injection $i_A: M_n^0(A) \to M_n(A)^r$ given by $M \mapsto (ME_i - E_iM)_{i=1}^r$.
We now choose a splitting $\pi_O$ of $i_O$ once and for all. We have a canonical
isomorphism $M_n^0(A) \cong M_n^0(O) \otimes_O A$, and $\pi_A = \pi_O \otimes \mathrm{id}_A$ is a splitting of
$i_A$. Consider the composite map

$$(7.2) \qquad \mathrm{CHom}_{\bar\rho}(G, \mathrm{Gl}_n(A)) \longrightarrow M_n(A)^r \overset{\pi_A}{\longrightarrow} M_n^0(A),$$

where the first map is $\rho \mapsto (\rho(g_i))_{i=1}^r$.
We say that $\rho$ is well-placed if its image in $M_n^0(A)$ is $\pi_O(E_1, \ldots, E_r) \otimes 1$.

(7.3) Lemma (Faltings). For every $\rho \in \mathrm{CHom}_{\bar\rho}(G, \mathrm{Gl}_n(A))$ there is a
matrix $M \in \mathrm{Gl}_n(A)$ reducing to $1 \in \mathrm{Gl}_n(k)$ so that $M\rho M^{-1}$ is well-placed.
This matrix M is determined uniquely modulo $1 + \mathfrak{m}_A$.


Proof. Put $\mathfrak{m} = \mathfrak{m}_A$. With induction on m we first show the lemma under
the hypothesis that $\mathfrak{m}^m = 0$. For m = 1 this is clear. To make the induction
step for $m \ge 2$ we can assume by the induction hypothesis that $\rho$ is
well-placed modulo $\mathfrak{m}^{m-1}$. We are done if we show that $(1 + M)\rho(1 + M)^{-1}$ is
well-placed for a unique $M \in M_n^0(\mathfrak{m}^{m-1}) = \mathfrak{m}^{m-1}M_n^0(A)$, and this follows
from the fact that the maps in (7.2) respect suitable actions of $M_n^0(\mathfrak{m}^{m-1})$:
we let $M \in M_n^0(\mathfrak{m}^{m-1})$ act by conjugation with 1 + M on the leftmost set,
by translation with $i_A(M)$ on the middle group, and by translation with
M on $M_n^0(A)$.

To obtain the general case one refines the conjugating matrix modulo
increasing powers of $\mathfrak{m}$ (recall that an $\mathfrak{m}$-adic Cauchy sequence in A
converges to a unique limit in A even if A has a coarser topology). □


We apply the lemma to the deformation $\rho_{\bar\rho}$ of Proposition (2.5), and we
let $\rho$ be the well-placed conjugate of $\rho_{\bar\rho}$. Define R to be the smallest closed
sub-O-algebra of $R_{\bar\rho}$ that contains all entries of $\rho(g)$ for all $g \in G$. Then $\rho$
defines a deformation D of V in R, and we claim that properties (1)-(4)
of Theorem (2.3) now hold. The map $\mathrm{Hom}_C(R, A) \to \mathrm{Def}(V, A)$ in (1) is
again surjective. To see injectivity, suppose that for $f_1, f_2 \in \mathrm{Hom}_C(R, A)$
the well-placed composite maps

$$\rho_j: G \overset{\rho}{\longrightarrow} \mathrm{Gl}_n(R) \xrightarrow{\mathrm{Gl}_n(f_j)} \mathrm{Gl}_n(A) \qquad (j = 1, 2)$$

give the same deformation of V in A. By the argument of (4.1) together
with the uniqueness statement in (7.3) it follows that $\rho_1 = \rho_2$, and by the
definition of R this implies that $f_1 = f_2$. The proofs of (2) and (4) are as
before. For (3) we just remark that the argument at the end of Section 5
showing that $H^1(G, \mathrm{End}_k(V)) \cong T$ only uses that $\mathrm{End}_{k[G]}(V) = k$. This
proves (7.1). □
HECKE ALGEBRAS AND THE
GORENSTEIN PROPERTY 


JACQUES TILOUINE 


Université de Paris-Nord 


The goal of this paper is to show the importance of the Gorenstein 
property for the Hecke algebra and its relation with the local freeness of 
the cohomology of modular curves as a module over the Hecke algebra. 

Hence the text revolves around Section 2.1 of [W4]. The main theo- 
rem of this part of Wiles’ paper (Theorem 2.1 of Section 2.1 in [W4]) is 
used crucially in the proof of the Shimura-Taniyama-Weil conjecture for 
semistable curves in two instances: 

(i) On line 2, page 559 of the Taylor-Wiles paper [TW], to ensure that it
is enough to prove that a cohomology group is free over the group 
algebra O[A] in order to conclude that the Hecke algebra is free 
over O[A]. 

(ii) In the proof of Proposition 2.15 of [W4] (page 507) , which in turn 
is used to prove Theorem 3.1. 


The Gorenstein property emerged in the 1940s and 1950s from the study of
singular plane curves, and in particular of their duality theory. It was
then used in classical algebraic geometry until B. Mazur recognized
its relevance to the study of congruences between modular forms or,
equivalently, of local components of the Hecke algebra. It is first introduced
as an important tool in [M1].

There, among other things, it is shown that if $\mathfrak{M}$ is a p-ordinary maximal
ideal of the Hecke algebra $\mathbf{T}$ acting on the jacobian J of $X_0(N)$ (for N
prime), then:

(A) $J[\mathfrak{M}]$ is of dimension two; it is also shown there (by a different
method) that the completion $\mathbf{T}_{\mathfrak{M}}$ is a complete intersection, hence:

(B) $\mathbf{T}_{\mathfrak{M}}$ is Gorenstein.

Moreover, by an elementary argument, given the basic context of [M1],
one can show directly that (A) is equivalent to (B) [see the Appendix of
this paper]; therefore, in effect, one has two routes for showing both (A)
and (B).

However, already in [M1], B. Mazur noticed that the other direction
was interesting in the “non-Eisenstein” case. In fact, in that case, one can
often directly establish that $J[\mathfrak{M}]$ is 2-dimensional. One must (i) first cut
it into two pieces, then (ii) bound by one the dimension of one piece, and
(iii) finally deduce by some trick that the other is one-dimensional as well;
for these steps one uses the following tools:


• If $\mathfrak{M}$ is supersingular, one gets (i) by the theory of Dieudonné
modules.

• If $\mathfrak{M}$ is ordinary with potential good reduction, one gets (i) by the
étale-connected dévissage of finite flat group schemes.

• If $\mathfrak{M}$ is ordinary semistable, one gets (i) by Raynaud’s theory of
semistable abelian varieties.

• Then, by various ingenious arguments, (ii) is brought down to the
question of “weak multiplicity one” in characteristic p. More precisely,
in each of the cases above, one piece of $J[\mathfrak{M}]$ is related to
$H^0(X_0(N) \otimes \mathbf{F}_p, \Omega)[\mathfrak{M}]$, so that one needs to see that this space
is one-dimensional. After q-expansion, this amounts to the
above-mentioned weak multiplicity one question.

• In all cases, one gets (iii) by various concluding arguments (using
the Brauer-Nesbitt or Krull-Schmidt-Akizuki theorems).


Actually, this scheme of proof has been used in several papers, such as 
(Wi, (A, (MW), [Ti], [Mei], [w4]. 

In the present paper, we start by briefly recalling some properties
of Gorenstein rings which are relevant to us (Section 1); then we
focus on the modular situation, defining the local Hecke algebra we want
to study (Section 2). After these preliminaries we state the Main Theorem
and its corollaries (Section 3). Then we explain the strategy of the proof
(Section 4). Finally, in Section 5, we give the proof itself along the
lines of Section 4. The author wishes to thank B. Mazur and K. Ribet for
correspondence and discussions which clarified several questions. The
author is also thankful to Mrs. C. Simon from Université de Paris-Nord,
who prepared the final form from a rough manuscript.


1. THE GORENSTEIN PROPERTY 


Let $(R, \mathfrak{M}, k)$ be a triple consisting of a noetherian local ring R with
maximal ideal $\mathfrak{M}$ and residue field k. For any R-module M and any ideal
$\mathfrak{A}$ of R, we denote by $M[\mathfrak{A}]$ the submodule of M consisting of vectors
annihilated by $\mathfrak{A}$. Let $d = \dim R$ be the Krull dimension of R.


Definition 1.1. We say that R is Gorenstein of dimension d (briefly, R
is a $G_d$-ring) if:

\[
(\ast)\qquad \mathrm{Ext}^i_R(k, R) =
\begin{cases}
0 & \text{if } i < d, \\
k & \text{if } i = d.
\end{cases}
\]


Comment 1. By [Mat] (Theorem 2.6 and 16.A), condition $(\ast)$ implies that
a Gorenstein ring is Cohen-Macaulay; moreover by [B], Theorem 4.1, a
Cohen-Macaulay ring is a $G_d$-ring if and only if the injective envelope E of
its residue field k is a “dualizing module for artinian R-modules M”; that
is, there are functorial isomorphisms

\[ \mathrm{Ext}^d_R(M, R) \xrightarrow{\ \sim\ } \mathrm{Hom}_R(M, E) \]

for all finite length R-modules M.


Comment 2. In fact, one can get a better picture of the significance of this 
condition by using Grothendieck’s local duality theory [Gr]. 


Proposition and Definition 1.2. (See [Gr], Propositions 4.9 and 4.10.)
A dualizing module for R is an injective R-module I such that one of the
following conditions is satisfied:

(i) For each artinian R-module M, $\mathrm{Hom}_R(M, I)$ is finitely generated and
the canonical homomorphism

\[ M \to \mathrm{Hom}(\mathrm{Hom}(M, I), I) \]

is an isomorphism.
(ii) $I[\mathfrak{M}]$ is one-dimensional.
(iii) I is an injective hull of the residue field $k = R/\mathfrak{M}$.

Example. For $R = \mathbf{Z}_p$, one has $I = \mathbf{Q}_p/\mathbf{Z}_p$.

Grothendieck defines the local cohomology of a finitely generated
R-module M by

\[ H^i_{\mathfrak{M}}(M) = \varinjlim_{n} \mathrm{Ext}^i_R(R/\mathfrak{M}^n, M) \]

for any $i = 0, 1, \ldots, d$. Let us put

\[ I = H^d_{\mathfrak{M}}(R) = \varinjlim_{n \to \infty} \mathrm{Ext}^d_R(R/\mathfrak{M}^n, R). \]

One has the proposition (Proposition 4.14 of [Gr]):


Proposition 1.2. R is a $G_d$-ring if and only if R is Cohen-Macaulay and
$I = H^d_{\mathfrak{M}}(R)$ is dualizing.


If R is regular, or more generally, if it is locally a complete intersection,
then the theory of Koszul complexes shows that R is Gorenstein. (See [Ku],
Proposition 3.22, Chapter VI, or [Gz] page 67.) In particular, if $R = \mathbf{Z}_p[\eta]$ is
generated by one element $\eta$, then it is Gorenstein. Finally, Grothendieck’s
duality theorem ([Gr], Theorem 6.3, page 85) for Gorenstein rings reads as
follows:
Theorem 1.3. Let R be a complete noetherian local $G_d$-ring, and let
$I = H^d_{\mathfrak{M}}(R)$. For any finitely generated R-module N, denote by $D(N) =
\mathrm{Hom}_R(N, I)$. Yoneda’s pairing

\[ H^i_{\mathfrak{M}}(M) \times \mathrm{Ext}^{d-i}_R(M, R) \to I \]

is “perfect” in the sense that for any i, it induces
(1) an isomorphism $H^i_{\mathfrak{M}}(M) \cong D(\mathrm{Ext}^{d-i}_R(M, R))$,
(2) an isomorphism $\mathrm{Ext}^{d-i}_R(M, R) \cong D(H^i_{\mathfrak{M}}(M))$.


In this paper, we shall be concerned only with local rings R which are
finite flat $\mathbf{Z}_p$-algebras. These rings are complete noetherian local of
dimension 1. We put $\bar{R} = R/pR$. It is an artinian local ring. We denote its
maximal ideal by $\overline{\mathfrak{M}}$.


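Proposition 1.4. Let R be a finite flat local $\mathbf{Z}_p$-algebra. The following
statements are equivalent:

(i) R is a $G_1$-ring.

(ii) $\bar{R}$ is a $G_0$-ring.

(iii) $\bar{R}[\overline{\mathfrak{M}}]$ is one-dimensional over k.

(iii)’ $\bar{R}^* = \mathrm{Hom}_{\mathbf{F}_p\text{-lin}}(\bar{R}, \mathbf{F}_p)$ is free (of rank 1) over $\bar{R}$.

(iv) $\mathrm{Hom}_{\mathbf{Z}_p\text{-lin}}(R, \mathbf{Z}_p)$ is free (of rank 1) over R.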


Proof. To show the equivalence of (i), (ii), and (iii), we consider the short
exact sequence of R-modules

\[ 0 \to R \xrightarrow{\ p\ } R \to \bar{R} \to 0. \]

We apply the functors $\mathrm{Ext}^i_R(k, -)$ and we observe

\[ \mathrm{Ext}^1_R(k, R) = \mathrm{Hom}_R(k, \bar{R}) = \mathrm{Ext}^0_{\bar{R}}(k, \bar{R}) = \bar{R}[\overline{\mathfrak{M}}]. \]

To prove the equivalence of (iii), (iii)’ and (iv), one applies Nakayama’s
lemma for $\overline{\mathfrak{M}}$, respectively pR. We leave the details to the reader.

2. HECKE ALGEBRAS 


Let p > 2 be a rational prime, and let N > 1 be an integer such that
$\mathrm{ord}_p(N) \le 1$. We consider the curve $X_1(N)_{\mathbf{Q}}$ which represents the moduli
problem of generalized elliptic curves over $\mathbf{Q}$ with a $\mathbf{Q}$-rational point of
exact order N (see [K-M] chapter 3, 3.2, and [D-R]).

It is a geometrically connected smooth projective curve defined over $\mathbf{Q}$.
Observe that in this model, the $\infty$-cusp is $\mathbf{Q}(\zeta_N)$-rational but not rational;
on the other hand, the 0-cusp is $\mathbf{Q}$-rational. Let $H \subset (\mathbf{Z}/N\mathbf{Z})^\times$ be a
subgroup,

\[
X_{\mathbf{Q}} = X_1(N)_{\mathbf{Q}}/H \ \text{(the quotient by } H\text{)}, \qquad
J_{\mathbf{Q}} = \mathrm{Alb}(X_{\mathbf{Q}}) \ \text{(its jacobian variety)},
\]

the latter being considered with its covariant functoriality for algebraic
correspondences.

We introduce the Hecke algebra $\mathbf{T}$, defined as the subring of $\mathrm{End}(J_{\mathbf{Q}})$
generated by the Hecke operators $T_\ell$ and the operators $\langle a \rangle$ with
$a \in (\mathbf{Z}/N\mathbf{Z})^\times/H$. If p divides N, we use the Atkin-Lehner notation $U_p$
instead of $T_p$, in order to emphasize that p divides the level. It is well known
that $\mathbf{T}$ is a finite flat $\mathbf{Z}$-algebra.

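We assume we are given a maximal ideal $\mathfrak{M}$ of $\mathbf{T}$ satisfying the following
conditions:

(1) The residue field $k = \mathbf{T}/\mathfrak{M}$ has characteristic p.
(2) There exists a continuous representation

\[ \rho_{\mathfrak{M}} : \mathrm{Gal}(\bar{\mathbf{Q}}/\mathbf{Q}) \to \mathrm{GL}_2(k) \]

which is unramified outside Np, and such that for any $\ell$ relatively
prime to Np,

\[
\mathrm{Tr}\, \rho_{\mathfrak{M}}(\mathrm{Frob}_\ell) \equiv T_\ell \pmod{\mathfrak{M}}, \qquad
\det \rho_{\mathfrak{M}}(\mathrm{Frob}_\ell) \equiv \ell \langle \ell \rangle \pmod{\mathfrak{M}}.
\]

(3) $\rho_{\mathfrak{M}}$ is absolutely irreducible.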


Remark 1. Note that condition (2) implies that if c is a complex conjugation,
then $\det \rho_{\mathfrak{M}}(c) = -1$. This can be seen by choosing $\ell \equiv -1 \pmod{Np}$
(so that $\ell \langle -1 \rangle \equiv -1$).


Remark 2. Condition 3 will be referred to in the sequel as the non-Eisen- 
stein condition. 

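From now on, we assume that $\mathfrak{M}$ is a non-Eisenstein maximal ideal
satisfying (1), (2), (3). Let $\mathbf{T}_{\mathfrak{M}}$ be the completion of $\mathbf{T}$ at $\mathfrak{M}$. It is a local
finite flat $\mathbf{Z}_p$-algebra. Let $\mathbf{T}_p = \mathbf{T} \otimes \mathbf{Z}_p$; then $\mathbf{T}_{\mathfrak{M}}$ is a direct factor of the
semi-local ring $\mathbf{T}_p$. Hence it is a projective $\mathbf{T}_p$-algebra.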


3. THE MAIN THEOREM 
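Let $\mathfrak{M}$ be a maximal ideal of $\mathbf{T}$ satisfying (1), (2), (3) as above.

Definition 3.1. We say that $\mathfrak{M}$ is ordinary if

\[ T_p \notin \mathfrak{M} \quad (\text{if } p \nmid N), \]

or

\[ U_p \notin \mathfrak{M} \quad (\text{if } p \mid N). \]

Let $G_p \subset \mathrm{Gal}(\bar{\mathbf{Q}}/\mathbf{Q})$ be a decomposition group at p and $I_p \subset G_p$ its
inertia group. We recall an important result.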
Theorem 3.2. If $\mathfrak{M}$ is ordinary, then

\[
\rho_{\mathfrak{M}}|_{G_p} \cong \begin{pmatrix} \chi_1 & * \\ 0 & \chi_2 \end{pmatrix},
\]

where $\chi_2(I_p) = 1$ and if $p \mid N$, we have

\[ \chi_2(\mathrm{Frob}_p) \equiv U_p \pmod{\mathfrak{M}}. \]


Comment. This fact was known to Deligne and Serre in the early 1970s
(letter from Deligne to Serre) and is proven in [H], Proposition 4.4. Note
that the presentation in [W2], Theorem 2.2, also treats the p-adic case. We
postpone its proof until Section 5 below.


Definition 3.3. A maximal ideal $\mathfrak{M}$ of $\mathbf{T}$ satisfying conditions (1), (2), (3)
is called $G_p$-distinguished if it is ordinary with $\chi_1 \ne \chi_2$.


Let $\mathfrak{M}$ be a maximal ideal of $\mathbf{T}$ satisfying (1), (2), (3). Let $R = \mathbf{T}_{\mathfrak{M}}$ and
$\bar{R} = R/pR$. The main theorem of this paper is the following.


Theorem 3.4. If $p \nmid N$, or if $p \mid N$ and $\mathfrak{M}$ is $G_p$-distinguished, then

\[ V = J[p](\bar{\mathbf{Q}})_{\mathfrak{M}} \]

is free of rank 2 over $\bar{R}$, and $\bar{R}$ is a $G_0$-ring.

Let us draw some corollaries.


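Corollaries.
(1) R is a $G_1$-ring, and

\[ R \cong \mathrm{Hom}_{\mathbf{Z}_p\text{-lin}}(R, \mathbf{Z}_p) \]

as an R-module.
(2) $M = \mathrm{Ta}_p(J_{\mathbf{Q}})_{\mathfrak{M}}$ and $M^* = \mathrm{Hom}_{\mathbf{Z}_p\text{-lin}}(M, \mathbf{Z}_p)$ are free of rank 2 over
R.
(3) $H_1(X(\mathbf{C}), \mathbf{Z}_p)_{\mathfrak{M}}$ is free of rank 2 over $\mathbf{T}_{\mathfrak{M}}$.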


Comment. The analogue of statement (3) above is not known in the
following situations:


(i) before localization at a maximal prime,
(ii) after localization at an Eisenstein prime (i.e., such that $\rho_{\mathfrak{M}}$ is
reducible),
(iii) after localization at an ordinary maximal ideal which isn’t $G_p$-distinguished.
Actually, the question of whether such an ideal exists
in level Np is still open (see [Bu]).
Proof of the corollaries. (1) This is clear from Proposition 1.2.
(2) We have M/pM = V, hence by Nakayama’s lemma, we have a
surjective R-linear homomorphism $R^2 \twoheadrightarrow M$. It must be injective because
it induces an isomorphism after tensoring with $\mathbf{Q}_p$ and because R is flat
over $\mathbf{Z}_p$.

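For $M^*$, one recalls that the Weil pairing induces a perfect
Galois-equivariant pairing

\[ \langle \cdot, \cdot \rangle : \mathrm{Ta}_p(J_{\mathbf{Q}}) \times \mathrm{Ta}_p(J_{\mathbf{Q}}) \to \mathbf{Z}_p(1) \]

satisfying

\[ \langle t x, y \rangle = \langle x, t^* y \rangle \quad \text{for all } x, y \in \mathrm{Ta}_p(J_{\mathbf{Q}}) \text{ and } t \in \mathbf{T}, \]

where $t^*$ is the image of t by the Rosati involution.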

On the other hand, consider the Atkin-Lehner automorphism $w_\zeta$ (for a
fixed primitive N-th root of unity $\zeta$) of $X_1(N)$ defined by

\[ w_\zeta : (E, P) \mapsto (E/\langle P \rangle, Q), \]

where Q is an N-torsion point on the elliptic curve $E/\langle P \rangle$ such that

\[ e_{E,N}(P, Q) = \zeta \]

(for the Weil pairing $e_{E,N} : E[N] \times E[N] \to \mu_N$ on E[N]). It has the property
that

\[ w_\zeta \circ t \circ w_\zeta^{-1} = t^*. \]


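Therefore if we introduce the new pairing

\[ [x, y] = \langle w_\zeta x, y \rangle, \]

we obtain, by localization at $\mathfrak{M}$, a perfect R-bilinear pairing

\[ M \times M \to \mathbf{Z}_p. \]

Therefore, as R-modules, we have

\[ M \cong \mathrm{Hom}_{\mathbf{Z}_p\text{-lin}}(M, \mathbf{Z}_p). \]

Hence $M^* \cong R^2$.
(3) The transcendental description of the Albanese variety,

\[ H_1(X(\mathbf{C}), \mathbf{Z}_p) = \mathrm{Ta}_p(J_{\mathbf{Q}}), \]

is $\mathbf{T}_p$-equivariant. Therefore after localizing at $\mathfrak{M}$, we obtain (3).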
4. STRATEGY OF THE PROOF OF THEOREM 3.4


There are several cases to examine. They are, chronologically,


1. $p \nmid N$ (the first case studied, see [Ma1] section 14, proposition 14.2),
2. $p \mid N$, $\det \rho_{\mathfrak{M}}|_{I_p} \ne \omega|_{I_p}$, and $\mathfrak{M}$ is $G_p$-distinguished,
3. $p \mid N$ and $\det \rho_{\mathfrak{M}}|_{I_p} = \omega|_{I_p}$.


In this last case, note that $G_p$-distinguishability is automatic, since $\chi_2$ is
unramified while $\chi_1|_{I_p} = \omega|_{I_p}$ is ramified. Keeping in mind the application
to Fermat’s Last Theorem, we need only study cases (1) and (3) (the
so-called flat, respectively ordinary, cases).


Remark. Note that in case (1), we have also

\[ \det \rho_{\mathfrak{M}}|_{I_p} = \omega|_{I_p}, \]

and that case (3) does not prevent $\rho_{\mathfrak{M}}$ from being flat: this is what
happens if we start with a modular elliptic curve E with multiplicative
reduction at p, but such that

\[ p \mid \mathrm{ord}_p(q_E). \]

We shall focus on case (3). As for case (1), no change is needed from
Mazur’s original treatment.

Let us write $N = N'p$ with $p \nmid N'$.


Step 0. We can replace $X_{\mathbf{Q}}$ by

\[ X_1(N', p) = X_1(N)/H_0 \]

for $H_0 = (\mathbf{Z}/p\mathbf{Z})^\times \subset (\mathbf{Z}/N\mathbf{Z})^\times$.
Recall that $X_{\mathbf{Q}}$ is defined as $X_1(N)_{\mathbf{Q}}/H$ for some subgroup $H \subset (\mathbf{Z}/N\mathbf{Z})^\times$.
The reason why we can do this reduction is that the maps between the
jacobians induced by

\[ \alpha : X_1(N)_{\mathbf{Q}} \to X_{\mathbf{Q}} \quad \text{and} \quad \beta : X_1(N)_{\mathbf{Q}} \to X_1(N', p)_{\mathbf{Q}} \]

give rise to isomorphisms on the p-divisible groups localized at $\mathfrak{M}$. Let
$J_1(N', p) = \mathrm{Jac}(X_1(N', p))$. Then

\[ \alpha : J_1(N)[p^\infty]_{\mathfrak{M}} \to J[p^\infty]_{\mathfrak{M}} \]

and

\[ \beta : J_1(N)[p^\infty]_{\mathfrak{M}} \to J_1(N', p)[p^\infty]_{\mathfrak{M}} \]

are isomorphisms by the theory of the fundamental group. The kernel of $\alpha$
(respectively $\beta$) is a quotient of H or of $(\mathbf{Z}/p\mathbf{Z})^\times$, but it must be a p-group
with Galois action whose Jordan-Hölder factors are isomorphic to $\rho_{\mathfrak{M}}$. This
is absurd (unless it is 0), since H has multiplicative-type Galois action.
What is the gain of this reduction? The scheme $X_1(N', p)_{\mathbf{Z}_p}$, defined
as the moduli space for triples (E, P, C), where E is an elliptic curve, P
is a point of order exactly N′, and C is a cyclic subgroup of E of order
exactly p, has the advantage of being “almost semistable”; that is, its
minimal regular model (which can be obtained by blowing up some points
in the special fiber) is semistable. By this we mean not only that the special 
fiber has ordinary double points, but that the singularities are of type Ap: 


\[ \mathbf{Z}_p[X, Y]/(XY - p). \]


Step 1. (which implies Theorem 3.2.) If $\mathfrak{M}$ is ordinary, then there exists
a short exact sequence of $\bar{R}[G_p]$-modules,

\[ (\dagger)\qquad 0 \to V^0 \to V \to V^E \to 0 \]

such that $V^0$ has all its irreducible subquotients ramified, while $V^E$ has all
its irreducible subquotients unramified. Moreover,

\[ V^E \cong \mathrm{Hom}(V^0, \mu_p) \]

as $\bar{R}[I_p]$-modules.


Step 2. Assume that $V^0$ is free over $\bar{R}$. Then V is also free. Hence $V \cong \bar{R}^2$
and $\bar{R}$ is a $G_0$-ring.


Step 3. We may assume by Step 0 that

\[ X = X_1(N', p) \quad \text{and} \quad J = J_1(N', p). \]

Let $J_{\mathbf{Z}_p}$ be the Néron model of J over $\mathbf{Z}_p$, and let $J_{\mathbf{Z}_p}[p]^t$ be the largest
finite flat subgroup scheme of $J_{\mathbf{Z}_p}[p]$ which is of multiplicative type. Let
$V^t = J_{\mathbf{Z}_p}[p]^t(\bar{\mathbf{Q}}_p)_{\mathfrak{M}}$. Then $V^t = V^0$.


Step 4. $V^t$ is free of rank 1 over $\bar{R}$.


5. SKETCH OF THE PROOF


Step 1. One shows actually that the whole p-divisible group

\[ D = J[p^\infty](\bar{\mathbf{Q}})_{\mathfrak{M}} \]

admits such a decomposition into p-divisible groups $D^0$ and $D^E$, as $R[G_p]$-modules.
Then one concludes by putting

\[ V^0 = D^0[p] \quad \text{and} \quad V^E = D^E[p]. \]
We observe first that there is an isogeny

\[ D \sim \prod_{(f, \mathfrak{P})} A_f[\mathfrak{P}^\infty], \]

where the product runs over the eigenforms $f \in S_2(\Gamma_1(N))$ and the primes
$\mathfrak{P} \mid p$ in the field of eigenvalues $K_f$ such that the eigensystem modulo $\mathfrak{P}$
defined by f factors through $\mathbf{T}_{\mathfrak{M}}$.

Recall that the abelian variety $A_f$ is defined as a quotient of $J_1(N', p)$
on which Hecke operators act by the eigenvalues of f (there is a canonical
embedding of $K_f$ in $\mathrm{End}(A_f) \otimes \mathbf{Q}$). Therefore, it is enough to deal
separately with each $A_f[\mathfrak{P}^\infty]$.

(a) Suppose first that p divides the conductor of f. Note that the p-part of
the character of f is trivial, and $\det \rho_{f,\mathfrak{P}} = \chi \varepsilon$, where $\chi$ is the p-cyclotomic
character and $\varepsilon$ is a character unramified at p, as can be seen by reduction
mod $\mathfrak{P}$. Let us show that this implies that $A_f$ has purely multiplicative
reduction at p.

We consider the two coverings

\[ X_{\mathbf{Q}} \rightrightarrows X_1(N') \]

induced by $\tau \mapsto \tau$ and $\tau \mapsto p\tau$ on the upper half-plane. By Albanese
functoriality, they induce a morphism

\[ \theta : J_1(N', p) \to J_1(N')^2. \]

Let A be the neutral component of $\mathrm{Ker}\, \theta$. By comparing the cotangent
spaces, we see that $A_f$ is a quotient of A.

By analyzing the Néron model of A over $\mathbf{Z}_p$, we see it has purely toric
reduction, hence so has $A_f$. The ingredients for this verification are:

• Theorem 2.5 of [Ra2], which identifies the connected component of
the Néron model $J_{\mathbf{Z}_p}$ of $J_{\mathbf{Q}}$ over $\mathbf{Z}_p$ as $\mathrm{Pic}^0(M_{\mathbf{Z}_p})$, where $M_{\mathbf{Z}_p}$ is the
regular minimal model of $X_{\mathbf{Z}_p}$.

• The rigidity of tori as in [Ra1] or [M2], to obtain a decomposition
of the group scheme $A_f[\mathfrak{P}^\infty]_{\mathbf{Z}_p}$ as

\[ 0 \to T_{\mathbf{Z}_p} \to A_f[\mathfrak{P}^\infty]_{\mathbf{Z}_p} \to E_{\mathbf{Z}_p} \to 0, \]

where T and E are p-divisible groups of multiplicative type,
respectively étale, over $\mathbf{Z}_p$.


By taking the $\bar{\mathbf{Q}}_p$-points, we obtain the desired decomposition of $\rho_f$.
(b) If p does not divide the conductor of f, then $A_f$ is a quotient of $J_1(N')$;
it has good reduction at p over $\mathbf{Z}_p$. If f is ordinary at $\mathfrak{P}$, we consider the
connected-étale decomposition of the p-divisible group

\[ 0 \to C \to A_f[\mathfrak{P}^\infty]_{\mathbf{Z}_p} \to E \to 0. \]

The corank over $\mathcal{O}_{K_f, \mathfrak{P}}$ of the middle term is 2, so one has to show that E
has corank 1.

To show that $\mathrm{corank}\, E \le 1$, the method of Mazur ([Ma1], proposition 14.7)
is to use the Cartier map $\delta$ which defines the Hecke-equivariant
injection

\[ \delta : J_1(N')[p](\bar{\mathbf{F}}_p) \otimes \bar{\mathbf{F}}_p \hookrightarrow H^0(X_1(N')_{\bar{\mathbf{F}}_p}, \Omega^1). \]

It is defined, for a linear class [D] of a divisor D such that pD = (g), by

\[ \delta([D] \otimes 1) = \frac{dg}{g}. \]

Then considering the q-expansion at $\infty$ of a global section of $\Omega^1$, and by
the very definition of Hecke operators, we see that

\[ H^0(X_1(N') \otimes \bar{\mathbf{F}}_p, \Omega^1)[\mathfrak{M}] \]

is one-dimensional (see [Ma1], proposition 9.3). To show that the corank
of E is exactly one, one uses the Eichler-Shimura relations as in [H],
proposition 4.4 (4.17).

This takes care of the ordinary case. If f is supersingular at p, one makes 
use of the Dieudonné module exactly as in [Ma1], proof of proposition 14.2, 
case 1, pages 114-116. 


Step 1. For the duality statement, we consider the modified Weil pairing,

\[ [\cdot, \cdot] : V \times V \to \mu_p. \]

Observe that it is only $\bar{R}[I_p]$-equivariant because there may be an
unramified character $\varepsilon$ such that


(x7, y"| = (x, ye), 


Nevertheless, $\mathrm{Hom}(V^0, \mu_p)$ is the maximal quotient of $\mathrm{Hom}(V, \mu_p) \cong V$
with unramified irreducible subquotients. It must therefore be $V^E$.


Step 2. Let us show that the short exact sequence $(\dagger)$ splits over $\bar{R}$ (not
as a sequence of Galois modules!).
Let $\sigma \in I_p$ be such that $\omega(\sigma)$ generates $\mathbf{F}_p^\times$. Let $\tilde{\omega}(\sigma)$ be the Teichmüller
lifting in $\mathbf{Z}_p^\times$ of $\omega(\sigma)$. Since V is finite, there exists an integer h > 0 such
that

\[ V = \ker(\sigma - \tilde{\omega}(\sigma))^h \oplus \ker(\sigma - 1)^h. \]

One can then easily check that the first summand is $V^0$, while the second
is isomorphic to $V^E$. By assumption, we have

\[ V^0 \cong \bar{R}, \]

and we have proven that, as $\bar{R}$-modules,

\[ V^E \cong \mathrm{Hom}(V^0, \mathbf{F}_p), \]

so

\[ V^E \cong \mathrm{Hom}_{\mathbf{F}_p\text{-lin}}(\bar{R}, \mathbf{F}_p). \]


On the other hand, since p > 2, the complex conjugation c induces
another decomposition of $\bar{R}$-modules:

\[ V = V^+ \oplus V^-. \]

So by the Krull-Schmidt-Akizuki theorem (see Curtis-Reiner, volume I,
theorem 14.5), we conclude:

\[ V^{\pm} \cong \bar{R} \quad \text{and} \quad V^{\mp} \cong \bar{R}^* \]

as $\bar{R}$-modules. (If a module of finite length is a direct sum of indecomposable
submodules in two different ways, then up to permutation the indecomposable
factors are isomorphic; this holds over any ring.)

We then observe that

\[ \dim_k V^+[\mathfrak{M}] = \dim_k V^-[\mathfrak{M}], \]

because $V[\mathfrak{M}]^{\mathrm{s.s.}}$ is a direct sum of copies of $\rho_{\mathfrak{M}}$ and because $\rho_{\mathfrak{M}}$ is odd, as
already noticed. Hence both eigenvalues ±1 have the same multiplicities
in $V[\mathfrak{M}]$.

We therefore conclude that

\[ \dim_k \bar{R}[\mathfrak{M}] = 1. \]

This is a criterion for $\bar{R}$ to be a $G_0$-ring; and so $\bar{R}^* \cong \bar{R}$. The theorem
follows.
Step 3. We have defined in the proof of Step 1 an abelian subvariety A of
$J_1(N', p)$:

\[ A = \bigl(\ker(J_1(N', p) \to J_1(N')^2)\bigr)^0. \]

It is defined over $\mathbf{Q}$ and it is stable under the action of $\mathbf{T}$.

Let B = J/A; it is defined over $\mathbf{Q}$ as well and carries an action of $\mathbf{T}$.
Moreover, it has good reduction over $\mathbf{Z}_p$, whereas A has toric reduction
over $\mathbf{Z}_p$.

Since p > 2, we have by [Ma2], Proposition 1.3, a diagram whose rows 
are short exact sequences of connected finite flat group schemes: 


C= Ap Jbl, => Bk, == 0 


(C) ac |< p|c 


0 ——+ Alpl., —— Jpg, —— Blp]g, —— 0 
The exactness can be checked on generic fibers by Raynaud’s theorem
([Ra3]) and the assumption p > 2 (hence 1 = e < p − 1).
Since all irreducible subquotients of $V^t = J[p]^t(\bar{\mathbf{Q}}_p)_{\mathfrak{M}}$ are ramified, we
have

\[ V^t \subset V^0. \]

To check the reverse inclusion, we take the $\bar{\mathbf{Q}}_p$-points of (C) and localize at
$\mathfrak{M}$:

(1) From the fact that $B[p]_{\mathfrak{M}}$ is ordinary, we see that the map induced
by $\beta$ is an isomorphism.
(2) By finiteness and flatness, we check that $\alpha$ induces an isomorphism
simply by comparing the ranks; it is enough to do that on the
geometric special fiber; there it is obvious since $A_{/\bar{\mathbf{F}}_p}$ is a split torus.


Hence $V^0 = V^t$.


Step 4. We pass to the tangent spaces and tangent maps in the first row 
of (C). Again from [Ma2], Proposition 1.3, we have a diagram with exact 
rows: 


Dee Alpe a il, eo a ee 
©) | a | 
hs Casi, == tie = foe 0 


Let us localize at 9. 
(1) ag is an isomorphism because A /k, is a torus. 
(2) Son is an isomorphism because B[p|sn is ordinary, so jo is an iso- 
morphism. 
At this final stage, we make use again of the weak multiplicity one 
theorem in characteristic p to prove: 
(+x) (tap, om = R 


(as R-modules). 
By Nakayama’s lemma, one must check 


dim, (t zp, /IN ; tz/E,) =1. 


Since X,(N’,p) has ordinary double points we find that Q is the sheaf of 
regular differentials on this curve, there is an F,-duality: 


tap, /M . to/E, — H°(X\(N’, p)p, , 2) [DN]. 


The right-hand side is one-dimensional by Mazur’s argument. 
To conclude that (**) holds we actually go up to $\mathbf{Z}_p$ and notice that


R - (t3/z,)n 


is an isomorphism when tensored with Q,, hence it is injective. 
We have thus proven that 


trpy = R@x, Fp. 


We observe then that for any finite flat multiplicative group scheme there 
is a functorial isomorphism 


tGe, = G(Q,) ®@ F, 


(see (2.9), page 488 of [W4]). 
So we see that

\[ V^t \otimes \bar{\mathbf{F}}_p \cong \bar{R} \otimes \bar{\mathbf{F}}_p \]

as $\bar{R}$-modules.
We leave it as an exercise in Galois descent that this implies that $V^t$ is
free over $\bar{R}$.


Appendix by B. Mazur (letter to K. Ribet and to the author)
Gorenstein-ness and the multiplicity one theorem in Sections
II.15 and II.16 of [M1]


The best manner to correct the erroneous proof of Lemma II.16.3 of [M1]
(which simply refers the reader to Lemma II.15.1) is to give a more
comprehensive proof of Lemma II.15.1. This new version of the lemma shows that
Corollary II.16.2 implies Corollary II.16.3. Actually, in Lemma II.15.1 one
shouldn’t make any distinction between “Eisenstein maximal ideal” and
any “ordinary, good reduction” maximal ideal. It should be stated in the
following degree of generality:

Let T be any finite flat (commutative) Z,-algebra such that T @ Q, is 
etale (or just Gorenstein!) and J any p-divisible group (over Spec Z,, say) 
which is “ordinary” in the sense that it is an extension of an etale p-divisible 
group J* by a multiplicative type p-divisible group J™. Suppose that T 
is a ring of endomorphisms of the p-divisible group J over SpecZ,. Let * 
denote Pontrjagin dual. 

Make these further assumptions: 

(1) Je" is a free T-module (rank d > 0, say). 

(2) J is self-dual as a p-divisible group (“Cartier” self-dual) and the 
action of T is “Hermitian” with respect to this self-duality, “Her- 
mitian” meaning of course “self-adjoint.” 
That’s it, I believe, for assumptions. Then these things are equivalent:


(i) J°*[9N] is of dimension d over the residue field. 
(ii) J[SN] is of dimension 2d over the residue field. 
(iii) J™” is free of rank d over T. 

(iv) J* is free of rank 2d over T. 

(v) T is Gorenstein. 


Proof. Since the self-duality produces a duality between etale and multi- 
plicative type parts we get that the T-modules J*** and J™** are Z,-duals. 
Since J*** @Q, is free of rank d over T@Q,, we get (by Gorenstein-ness 
of T@Q,) that J™” @ Q, is also free of rank d over T @ Q,. 

Now, I guess we can begin to tote up the equivalences. In dealing with 
property (iv) it will be useful to remember that T is Gorenstein if and only 
if the quotient T-module 


\[ \mathrm{Hom}(T, \mathbf{Z}_p)/\mathfrak{M}\,\mathrm{Hom}(T, \mathbf{Z}_p) \]


is of dimension one over the residue field. 

I claim that (i) is equivalent to (iii) because J[9] is the Pontrjagin dual 
of J*/INJ* . Also, (i) implies (iii), for by Nakayama, given (i), we can find 
a surjective T-homomorphism from T¢ to J™* which must have trivial 
kernel, as can be seen by counting the ranks over Q, of these modules 
tensored with Q,, since otherwise J™” @Q, could not be free of rank d 
over T ®Q,, which it is. By the same argument, (iv) implies (ii). By what 
we “remembered” above, counting dimensions over the residue field, we see 
that (v) is equivalent to (i) and to (ii), noting that J™” is the Z,-dual of 
a free T-module of rank d, and noting that J* is self-Z,-dual. 
CRITERIA FOR COMPLETE INTERSECTIONS


BART DE SMIT, KARL RUBIN, AND RENE SCHOOF 


Introduction 


In this paper we discuss two results in commutative algebra that are used in
A. Wiles’s proof that all semi-stable elliptic curves over Q are modular [11].
We first fix some notation that is used throughout this paper. Let $\mathcal{O}$
be a complete Noetherian local ring with maximal ideal $\mathfrak{m}_{\mathcal{O}}$ and residue
field $k = \mathcal{O}/\mathfrak{m}_{\mathcal{O}}$. Suppose that we have a commutative triangle of surjective
homomorphisms of complete Noetherian local $\mathcal{O}$-algebras:

\[
\begin{array}{ccc}
R & \xrightarrow{\ \varphi\ } & T \\
{\scriptstyle \pi_R} \searrow & & \swarrow {\scriptstyle \pi_T} \\
 & \mathcal{O} &
\end{array}
\]

Assume that T is a finite flat $\mathcal{O}$-algebra, i.e., that T is finitely generated
and free as an $\mathcal{O}$-module. In the applications in Wiles’s proof $\mathcal{O}$ is a discrete
valuation ring, R is a deformation ring, T is a Hecke algebra and $\pi_T$ is the
homomorphism associated to a certain eigenform.

We show two distinct criteria, formulated as Criterion I and Criterion II
below, which give sufficient conditions to conclude that $\varphi$ is an
isomorphism and that R and T are complete intersections. We say that
a local $\mathcal{O}$-algebra that is finitely generated as an $\mathcal{O}$-module is a complete
intersection over $\mathcal{O}$ if it is of the form

\[ \mathcal{O}[[X_1, \ldots, X_n]]/(f_1, \ldots, f_n), \quad \text{with } f_1, \ldots, f_n \in \mathcal{O}[[X_1, \ldots, X_n]]. \]

We first state Criterion I. We put $I_R = \ker \pi_R$ and $I_T = \ker \pi_T$. The
congruence ideal of T is defined to be the $\mathcal{O}$-ideal $\eta_T = \pi_T \mathrm{Ann}_T(I_T)$.


Criterion I. Suppose that $\mathcal{O}$ is a complete discrete valuation ring and
that $\eta_T \ne 0$. Then

\[ \mathrm{length}_{\mathcal{O}}(I_R/I_R^2) \ge \mathrm{length}_{\mathcal{O}}(\mathcal{O}/\eta_T). \]

Moreover, equality holds if and only if $\varphi$ is an isomorphism between
complete intersections over $\mathcal{O}$.
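
To make the two lengths in Criterion I concrete, here is a small worked case. The ring T below is an illustrative choice (it is not taken from the text), with $\mathcal{O} = \mathbf{Z}_p$, R = T and $\varphi$ the identity.

```latex
% Illustrative example (assumption: O = Z_p, R = T, phi = id).
\[
T = \{(a, b) \in \mathbf{Z}_p \times \mathbf{Z}_p : a \equiv b \pmod{p}\}
  \;\cong\; \mathbf{Z}_p[[X]]/(X(X - p)), \qquad X \mapsto (0, p), \qquad \pi_T(a, b) = a .
\]
% Kernel of pi_T and its square:
\[
I_T = \{(0, b) : b \in p\mathbf{Z}_p\}, \qquad
I_T^2 = \{(0, b) : b \in p^2\mathbf{Z}_p\}, \qquad
\mathrm{length}_{\mathcal{O}}(I_T/I_T^2) = 1 .
\]
% Annihilator of I_T and the congruence ideal:
\[
\mathrm{Ann}_T(I_T) = \{(a, 0) : a \in p\mathbf{Z}_p\}, \qquad
\eta_T = \pi_T\,\mathrm{Ann}_T(I_T) = p\mathbf{Z}_p, \qquad
\mathrm{length}_{\mathcal{O}}(\mathcal{O}/\eta_T) = 1 .
\]
```

Both lengths equal 1, so equality holds, in agreement with the fact that this T is presented by one variable and one relation.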


Wiles used a slightly weaker form of this criterion, where T is assumed
to be Gorenstein, to show that certain “non-minimal” deformation rings
are isomorphic to Hecke algebras [7]. The present version, without the
Gorenstein condition, is due to H.W. Lenstra [5]. In Section 3 we give
an alternative argument for Criterion I that was found by the first and
the third author. Criterion I is an easy consequence of the following result,
which holds without any conditions on $\mathcal{O}$ or $\eta_T$.

Theorem. The map $\varphi$ is an isomorphism between complete intersections
over $\mathcal{O}$ if and only if $\varphi\,\mathrm{Fit}_R(I_R) \not\subset \mathfrak{m}_{\mathcal{O}} T$.


Here $\mathrm{Fit}_R(I_R)$ denotes the R-Fitting ideal of $I_R$. Fitting ideals are
instrumental in the proof of Criterion I. We recall their definition and basic
properties in Section 1.

A crucial special case of the theorem can already be found in a 1969
paper of H. Wiebe [10]; see also [1, Thm. 2.3.16]. More precisely, Wiebe’s
result covers the case that $\mathcal{O} = k$ is a field, and $\varphi$ is the identity on R = T.
The statement is then that T is a complete intersection over k if and only
if the Fitting ideal of its maximal ideal is non-zero.

For the proof of Criterion I we need some properties of complete
intersections that go back to J.T. Tate [8]. In Section 2 we formulate Tate’s result
and prove it using Koszul complexes. These are discussed in Section 1. As a
consequence we find that complete intersections have the Gorenstein
property. The Gorenstein property does not occur in our proof of Criterion I,
but we briefly discuss its significance in our context at the end of Section 2.


In order to formulate Criterion II, assume that char(k) = p > 0, and
let $n \ge 1$. The ring $\mathcal{O}[[S_1, \ldots, S_n]]$ is filtered by the ideals $J_m$ with $m \ge 0$
given by $J_m = (\omega_m(S_1), \ldots, \omega_m(S_n))$, where $\omega_m(S)$ denotes the polynomial
$(1 + S)^{p^m} - 1$. Note that $J_0 = (S_1, \ldots, S_n)$.
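
The polynomials $\omega_m(S)$ can be explored numerically. The following is a minimal sketch in Python; the helper names `omega` and `divmod_poly` are hypothetical and introduced only for this illustration. It checks, for small p and m, that $\omega_m$ divides $\omega_{m+1}$ (so the ideals $J_m$ decrease as m grows) and that all coefficients of $\omega_m$ other than the leading one are divisible by p.

```python
from math import comb


def omega(p, m):
    """Coefficients [c_0, c_1, ...] (lowest degree first) of (1 + S)^(p^m) - 1."""
    n = p ** m
    coeffs = [comb(n, j) for j in range(n + 1)]
    coeffs[0] -= 1  # subtract the constant 1
    return coeffs


def divmod_poly(num, den):
    """Long division of integer polynomials; den must be monic.

    Returns (quotient, remainder), both as coefficient lists, lowest degree first.
    """
    num = num[:]  # work on a copy
    deg_den = len(den) - 1
    quotient = [0] * max(len(num) - deg_den, 0)
    for k in range(len(quotient) - 1, -1, -1):
        quotient[k] = num[k + deg_den]  # leading coefficient of den is 1
        for j, d in enumerate(den):
            num[k + j] -= quotient[k] * d
    return quotient, num[:deg_den]


if __name__ == "__main__":
    q, r = divmod_poly(omega(3, 2), omega(3, 1))
    print("remainder of omega_2 by omega_1 for p = 3:", r)
```

Here `omega(p, 0)` is the polynomial S itself, matching $J_0 = (S_1, \ldots, S_n)$.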


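Criterion II. Suppose that for every m > 0 there is a commutative
diagram of $\mathcal{O}$-algebras

\[
\begin{array}{ccccc}
\mathcal{O}[[S_1, \ldots, S_n]] & \longrightarrow & R_m & \xrightarrow{\ \varphi_m\ } & T_m \\
 & & \downarrow & & \downarrow \\
 & & R & \xrightarrow{\ \varphi\ } & T
\end{array}
\]

with the properties:
(i) there is a surjection of $\mathcal{O}$-algebras $\mathcal{O}[[X_1, \ldots, X_n]] \to R_m$;
(ii) the map $\varphi_m : R_m \to T_m$ is surjective;
(iii) the vertical arrows induce isomorphisms

\[ R_m/J_0 R_m \xrightarrow{\ \sim\ } R \quad \text{and} \quad T_m/J_0 T_m \xrightarrow{\ \sim\ } T; \]

(iv) the quotient ring $T_m/J_m T_m$ is finite flat over $\mathcal{O}[[S_1, \ldots, S_n]]/J_m$.
Then $\varphi : R \to T$ is an isomorphism between complete intersections over $\mathcal{O}$.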


Criterion II, with the additional condition that k be a finite field, first 
appeared in the paper by R. Taylor and A. Wiles [9] with an improvement 
due to G. Faltings. It is used by Wiles for the “minimal” deformation 
problem [2]. In section 4 we present a proof due to the second author. 
It does not depend on the previous sections of the paper. Our approach 
avoids the original non-canonical limiting process and works for arbitrary 
complete Noetherian local rings O. 
1. Preliminaries.


In this section we first recall the definition and basic properties of Fitting 
ideals. Then we do the same for Koszul complexes following [3]. For more 
details see [4, Sections XIX.2, XX1.4]. 


Fitting ideals. Let A be a ring and let M be a finitely generated
A-module with generators $m_1, \ldots, m_n$. Let $f : A^n \twoheadrightarrow M$ be the surjective
A-homomorphism defined by $f(e_i) = m_i$ for $i = 1, \ldots, n$. Here $e_i$
denotes the ith standard basis vector of $A^n$. The Fitting ideal $\mathrm{Fit}_A(M)$ of M
is the ideal generated by $\det(v_1, \ldots, v_n)$ with $v_1, \ldots, v_n \in \ker f$. Clearly,
$\mathrm{Fit}_A(M)$ is already generated by the elements $\det(v_1, \ldots, v_n)$ where the
vectors $v_1, \ldots, v_n$ range over a fixed set of A-module generators for $\ker f$.

The Fitting ideal does not depend on the choice of the generators $m_i$.
To see this, let $m_{n+1} = \sum_{i=1}^{n} a_i m_i$ with $a_i \in A$ be an additional
generator of M. The kernel of the surjective homomorphism $\psi : A^{n+1} \twoheadrightarrow M$
given by $\psi(e_i) = m_i$ for $i = 1, \ldots, n, n+1$, is generated by the vector
$(a_1, \ldots, a_n, -1)$ and vectors $(v_i, 0)$ where the $v_i$ range over a set of
generators for $\ker f$. It follows at once that the Fitting ideal does not change when
we replace the generators $m_1, \ldots, m_n$ by $m_1, \ldots, m_n, m_{n+1}$. Inductively,
this implies that any two generating sets $m_1, \ldots, m_n$ and $m'_1, \ldots, m'_t$ give
rise to the same Fitting ideal as their union $m_1, \ldots, m_n, m'_1, \ldots, m'_t$.


The following proposition contains the properties of the Fitting ideal 
that we will use. 


Proposition 1.1. Let A be a ring and let M be a finitely generated A- 
module. Then 
(i) we have Fit,4(M) C Anna(M); 
(ii) for any A-algebra B we have Fitg(M @, B) = Fit,(M) - B; 
(iii) for any ideal a C A we have Fit 4(A/a) = a; 
(iv) for every A-module N we have Fita(M x N) = Fita(M)Fit,(N). 


Proof. We sketch the proof. If $v_1,\dots,v_n$ are in the kernel of $A^n \xrightarrow{f} M$, then the matrix $\sigma$ with columns $v_1,\dots,v_n$ has the property that the composite map $A^n \xrightarrow{\sigma} A^n \xrightarrow{f} M$ is equal to zero. By multiplying first with the adjoint matrix of $\sigma$, we see that $\det(\sigma)\cdot A^n \subset \ker f$. Since $f$ is surjective, this implies that $\det(\sigma) \in \mathrm{Ann}_A(M)$, and (i) follows. Part (ii) follows from the fact that taking the tensor product with $B$ is right exact. Part (iii) is immediate from the definition if we take $n = 1$. We leave part (iv) to the reader.
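The definition can be tried out over $A = \mathbf{Z}$, where every ideal is principal and the Fitting ideal is generated by the gcd of the $n \times n$ determinants formed from relation vectors. The following is a minimal sketch, not part of the text: the function names `det` and `fitting_generator` are hypothetical, the ring is assumed to be $\mathbf{Z}$, and the module is assumed to be given as $\mathbf{Z}^n$ modulo a finite list of relation vectors generating $\ker f$.

```python
from itertools import combinations
from math import gcd


def det(rows):
    """Determinant of a small square integer matrix (Laplace expansion)."""
    if len(rows) == 1:
        return rows[0][0]
    total = 0
    for j, entry in enumerate(rows[0]):
        if entry:
            minor = [r[:j] + r[j + 1:] for r in rows[1:]]
            total += (-1) ** j * entry * det(minor)
    return total


def fitting_generator(relations, n):
    """Non-negative generator of Fit_Z(M) for M = Z^n / (span of relations).

    `relations` is assumed to generate ker f, so by the definition it is
    enough to take determinants of n-tuples chosen from this list.
    """
    g = 0
    for chosen in combinations(relations, n):
        g = gcd(g, det([list(v) for v in chosen]))
    return g


# M = Z/4 x Z/6 with its two obvious generators.
print(fitting_generator([(4, 0), (0, 6)], 2))
# Same module with the redundant third generator m3 = m1 + m2.
print(fitting_generator([(4, 0, 0), (0, 6, 0), (1, 1, -1)], 3))
```

Both calls describe the same module, once with two generators and once with an extra redundant generator, so they illustrate the independence of the Fitting ideal from the chosen generating set; the common value agrees with parts (iii) and (iv) of Proposition 1.1.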


If $A$ is a principal ideal domain, then, by the theory of elementary divisors, every finitely generated $A$-module $M$ is of the form $M \cong A/\mathfrak{a}_1 \times \dots \times A/\mathfrak{a}_t$ for certain ideals $\mathfrak{a}_i \subset A$. By (iii) and (iv), we see that $\mathrm{Fit}_A(M) = \mathfrak{a}_1 \cdots \mathfrak{a}_t$. If $A$ is a discrete valuation ring with maximal ideal $\mathfrak{m}_A$, then we see that

$$\mathrm{Fit}_A(M) = \mathfrak{m}_A^{\mathrm{length}_A(M)}$$

with the convention that $\mathfrak{m}_A^{\infty} = 0$.


Example. Let $A = \mathcal{O}[[X_1,\dots,X_n]]/J$ with $J = (f_1,\dots,f_r)$ an ideal contained in $I = (X_1,\dots,X_n)$, and put $I_A = I/J$. Suppose that $g_{ij} \in \mathcal{O}[[X_1,\dots,X_n]]$ satisfy

$$f_i = \sum_{j=1}^{n} g_{ij} X_j \qquad \text{for } i = 1,\dots,r.$$

Then the Fitting ideal $\mathrm{Fit}_A(I_A)$ contains the determinants of the $n \times n$ submatrices of the matrix $(g_{ij})$ modulo $J$. Actually, one can show that these determinants generate $\mathrm{Fit}_A(I_A)$ by applying Proposition 1.3 to the sequence $X_1,\dots,X_n$ in $\mathcal{O}[[X_1,\dots,X_n]]$. This will not be used in the sequel.


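Koszul complexes. Let $A$ be a ring, let $V = A^n$ and let $f = (f_1,\dots,f_n) \in V$. For any $A$-module $M$ we set

$$K_m(f,M) = \mathrm{Hom}_A\bigl(\textstyle\bigwedge^m V, M\bigr), \qquad \text{for } m \ge 0;$$

and for $\varphi \in K_m(f,M)$ we define $d\varphi \in K_{m-1}(f,M)$ by $d\varphi(x) = \varphi(f \wedge x)$. Since $d^2 = 0$, we obtain a complex $K_\bullet(f,M)$, which we call the Koszul complex of $f$ on $M$:

$$K_\bullet(f,M)\colon\quad 0 \to K_n(f,M) \xrightarrow{d} K_{n-1}(f,M) \xrightarrow{d} \cdots \xrightarrow{d} K_1(f,M) \xrightarrow{d} K_0(f,M) \to 0.$$

Note that $K_\bullet(f,M) = K_\bullet(f,A) \otimes_A M$ and that $K_m(f,A)$ is a free $A$-module of rank $\binom{n}{m}$. The $m$-th homology group of $K_\bullet(f,M)$ is denoted by $H_m(f,M)$.

We have $H_0(f,M) = M/IM$, where $I$ is the $A$-ideal generated by the $f_i$.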
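To make the definition concrete, the smallest cases can be written out by hand. The following worked block is an illustration rather than part of the text; it identifies $K_m(f,M)$ with a power of $M$ by evaluating on the standard basis of $\bigwedge^m V$ (with $e_1 \wedge e_2$ as the basis of $\bigwedge^2 V$ when $n = 2$), which is a choice of coordinates made here.

```latex
% n = 1, f = (f_1): the complex is multiplication by f_1 on M.
\[
  0 \to M \xrightarrow{\; f_1 \;} M \to 0,
  \qquad H_0(f,M) = M/f_1 M,
  \qquad H_1(f,M) = \{\, m \in M : f_1 m = 0 \,\}.
\]
% n = 2, f = (f_1, f_2): K_2 = M, K_1 = M^2, K_0 = M.
\[
  0 \to M \xrightarrow{\; m \,\mapsto\, (-f_2 m,\; f_1 m) \;} M^2
    \xrightarrow{\; (m_1, m_2) \,\mapsto\, f_1 m_1 + f_2 m_2 \;} M \to 0 .
\]
% d^2 = 0 because f_1(-f_2 m) + f_2(f_1 m) = 0, and the cokernel of the
% last map is M/(f_1, f_2)M = M/IM.
```

In the case $n = 1$ the vanishing of $H_1(f,M)$ says exactly that $f_1$ is a non-zero-divisor on $M$, which is the one-element case of the regularity condition used in Proposition 1.3.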


Lemma 1.2. The homology groups $H_m(f,M)$ are annihilated by $I$.


Proof. Let $\varphi \in K_m(f,M)$ with $d\varphi = 0$. For each generator $f_i$ of $I$ we must show that there is $\psi \in K_{m+1}(f,M)$ with $d\psi = f_i\varphi$. To see this, write $V = Ae_i \times V'$ where $e_i$ is the $i$th standard basis vector of $V$ over $A$, and $V'$ is generated by the other standard basis vectors. Then every $z \in \bigwedge^{m+1} V$ can be written as $z = e_i \wedge z' + z''$ for unique $z' \in \bigwedge^{m} V'$ and $z'' \in \bigwedge^{m+1} V'$. Now define $\psi \in K_{m+1}(f,M)$ by $\psi(z) = \varphi(z')$. From $d\varphi = 0$ one deduces that $d\psi = f_i\varphi$, as required.

We say that a sequence of elements $p_1,\dots,p_n$ in $A$ is $M$-regular, if for $i = 1,\dots,n$ the multiplication by $p_i$ on $M/(p_1,\dots,p_{i-1})M$ is an injective map. The following proposition can also be found in [1, Thm. 1.6.16].


Proposition 1.3. Let $f = (f_1,\dots,f_n) \in A^n$ and let $M$ be an $A$-module. If the $A$-ideal $I$ generated by $f_1,\dots,f_n$ contains an $M$-regular sequence of length $n$, then $H_i(f,M) = 0$ for $i \ge 1$.

Proof. Let $p_1,\dots,p_n \in A$ be an $M$-regular sequence in $I$. For any integer $j$ with $0 \le j \le n$ we prove inductively that $H_i(f, M/(p_1,\dots,p_j)M) = 0$ for all $i \ge j+1$. For $j = n$ this is trivial, and for $j = 0$ this is the content of the proposition.

Assume that this statement is true for some $j > 0$, and let $M' = M/(p_1,\dots,p_{j-1})M$. Since the sequence $p_1,\dots,p_n$ is $M$-regular, there is an exact sequence

$$0 \to M' \xrightarrow{\,p_j\,} M' \to M'/p_jM' \to 0.$$

For each $m$ we apply the exact functor $\mathrm{Hom}_A(\bigwedge^m V, -)$. This gives us a short exact sequence of complexes

$$0 \to K_\bullet(f,M') \xrightarrow{\,p_j\,} K_\bullet(f,M') \to K_\bullet(f,M'/p_jM') \to 0.$$

By Lemma 1.2 the homology groups of $K_\bullet(f,M')$ are annihilated by $I$ and therefore by $p_j$. This implies that the long exact homology sequence breaks up into short exact sequences. For every $i = 1,\dots,n$ we obtain an exact sequence

$$0 \to H_i(f,M') \to H_i(f,M'/p_jM') \to H_{i-1}(f,M') \to 0.$$

The induction hypothesis implies that the middle group is zero for $i \ge j+1$. This implies that $H_i(f,M') = 0$ for $i \ge j$, which is the statement for $j - 1$.


2. Complete intersections. 


This section is devoted to the proof of the following result, which goes back 
to Tate [8]. 


Proposition 2.1. Let $\mathcal{O}$ be a complete Noetherian local ring. Let $A$ be a finite flat $\mathcal{O}$-algebra of the form $A = \mathcal{O}[[X_1,\dots,X_n]]/(f_1,\dots,f_n)$ with $(f_1,\dots,f_n) \subset (X_1,\dots,X_n)$. Write $f_i = \sum_{j=1}^{n} g_{ij}X_j$, let $d$ be the image of $\det(g_{ij})$ in $A$, and let $I_A$ be the $A$-ideal $I_A = (X_1,\dots,X_n)/(f_1,\dots,f_n)$. Then we have

(i) $\mathrm{Fit}_A(I_A) = \mathrm{Ann}_A(I_A) = (d)$;

(ii) the $A$-ideal $(d)$ is a direct $\mathcal{O}$-summand of $A$ of $\mathcal{O}$-rank 1.


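Proof. Let $P = \mathcal{O}[[X_1,\dots,X_n]]$. We write $f$ for the vector $(f_1,\dots,f_n) \in P^n$. Multiplication with the matrix $(g_{ij})$ gives a $P$-linear map $P^n \to P^n$ sending the vector $X = (X_1,\dots,X_n)$ to $f$. It induces a morphism of Koszul complexes

$$K_\bullet(f,P) \to K_\bullet(X,P).$$

The sequence $X_1,\dots,X_n$ is $P$-regular. Since $A$ is finitely generated as an $\mathcal{O}$-module, there is for every $i$ a monic polynomial $p_i(X_i) \in (f_1,\dots,f_n)$. The sequence $p_1,\dots,p_n$ is $\mathcal{O}[X_1,\dots,X_n]$-regular and by exactness of completion it is also $P$-regular. By Prop. 1.3 the homology groups of both Koszul complexes vanish and we obtain the following commutative diagram with exact rows

$$
\begin{array}{ccccccccccccccc}
0 & \to & P & \xrightarrow{(f_1,\dots,f_n)} & P^n & \to & \cdots & \to & P^n & \xrightarrow{(f_1,\dots,f_n)} & P & \to & A & \to & 0 \\
 & & \downarrow{\scriptstyle \det(g_{ij})} & & \downarrow & & & & \downarrow{\scriptstyle (g_{ij})} & & \| & & \downarrow{\scriptstyle \pi_A} & & \\
0 & \to & P & \xrightarrow{(X_1,\dots,X_n)} & P^n & \to & \cdots & \to & P^n & \xrightarrow{(X_1,\dots,X_n)} & P & \to & \mathcal{O} & \to & 0.
\end{array}
$$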


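Here $\pi_A$ is the $\mathcal{O}$-algebra map $A \to \mathcal{O}$ with kernel $I_A$. We now tensor the whole diagram on the right with the $P$-module $A$. Since the rows are $P$-free resolutions of $A$ and $\mathcal{O}$, the homology groups of the rows become $\mathrm{Tor}^P_i(A,A)$ and $\mathrm{Tor}^P_i(\mathcal{O},A)$ respectively. Hence, we obtain a commutative diagram with exact rows:

$$
\begin{array}{ccccccc}
0 & \to & \mathrm{Tor}^P_n(A,A) & \to & A & \xrightarrow{\;0\;} & A^n \\
 & & \downarrow{\scriptstyle \pi_{A*}} & & \downarrow{\scriptstyle d} & & \downarrow \\
0 & \to & \mathrm{Tor}^P_n(\mathcal{O},A) & \to & A & \xrightarrow{(X_1,\dots,X_n)} & A^n.
\end{array}
$$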


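It follows that $\mathrm{Tor}^P_n(\mathcal{O},A) \cong \mathrm{Ann}_A(I_A)$. In order to determine this Tor-group and the image of $\pi_{A*}$, we tensor the $P$-resolution $K_\bullet(f,P)$ of $A$ on the left with the $P$-module map $\pi_A\colon A \to \mathcal{O}$. This gives a map between two complexes with homology groups $\mathrm{Tor}^P_i(A,A)$ and $\mathrm{Tor}^P_i(\mathcal{O},A)$ respectively. Since one can compute Tor-functors using resolutions of either argument [4, Chap. XX, Prop. 8.2'], the same map $\pi_{A*}$ then makes the following diagram with exact rows commute:

$$
\begin{array}{ccccccc}
0 & \to & \mathrm{Tor}^P_n(A,A) & \to & A & \xrightarrow{\;0\;} & A^n \\
 & & \downarrow{\scriptstyle \pi_{A*}} & & \downarrow{\scriptstyle \pi_A} & & \downarrow \\
0 & \to & \mathrm{Tor}^P_n(\mathcal{O},A) & \to & \mathcal{O} & \xrightarrow{\;0\;} & \mathcal{O}^n.
\end{array}
$$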


In particular we see that $\pi_{A*}$ is surjective, so that $(d) = \mathrm{Ann}_A(I_A)$ and $(d)$ is free of rank 1 as an $\mathcal{O}$-module. On the other hand,

$$(d) \subset \mathrm{Fit}_A(I_A) \subset \mathrm{Ann}_A(I_A),$$

and therefore we have equality everywhere. By applying what we have already proved to the complete intersection $A \otimes_{\mathcal{O}} k$ over $k$ we see that $d \otimes 1 \ne 0$ in $A \otimes_{\mathcal{O}} k$, so that $d \notin \mathfrak{m}_{\mathcal{O}}A$. By Nakayama's lemma we can therefore make the element $d$ part of an $\mathcal{O}$-basis of $A$, so that the inclusion $(d) \subset A$ splits as an $\mathcal{O}$-linear map. This proves the proposition.
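The statement can be checked by hand in the smallest non-trivial case. The block below is an illustrative sketch, not taken from the text; it assumes $n = 1$ and an element $c$ of the maximal ideal of $\mathcal{O}$ (both choices are made here for the example), so that $A = \mathcal{O}[[X]]/(X^2 - cX)$ is free over $\mathcal{O}$ with basis $1, X$.

```latex
\begin{align*}
  f_1 &= X^2 - cX = g_{11} X, \qquad g_{11} = X - c, \qquad d = X - c \in A,\\
  I_A &= (X), \qquad (u + vX)\,X = (u + vc)\,X \quad (u, v \in \mathcal{O}),\\
  \mathrm{Ann}_A(I_A) &= \{\, u + vX : u = -vc \,\} = \mathcal{O}\cdot(X - c),\\
  (d) &= \mathcal{O}\cdot(X - c) + \mathcal{O}\cdot X(X - c) = \mathcal{O}\cdot(X - c)
        \quad\text{since } X(X - c) = 0 \text{ in } A,\\
  A &= \mathcal{O}\cdot 1 \,\oplus\, \mathcal{O}\cdot d .
\end{align*}
```

So in this example $(d) = \mathrm{Ann}_A(I_A)$ is free of rank 1 over $\mathcal{O}$ and is a direct summand of $A$, in line with parts (i) and (ii).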


Corollary 2.2. If in the situation of Proposition 2.1 the ring $\mathcal{O}$ is a field, then $(d)$ is the unique minimal non-zero ideal of $A$.

Proof. Proposition 2.1 says that $(d)$ has dimension 1 over $\mathcal{O} = k$, so $(d)$ contains no smaller non-zero ideals. On the other hand, every minimal ideal $\mathfrak{a}$ is annihilated by the maximal ideal $I_A$ of $A$, and by Proposition 2.1 we have $\mathrm{Ann}_A(I_A) = (d)$, so $\mathfrak{a} \subset (d)$.


Corollary 2.3. Let $A$ be a finite flat $\mathcal{O}$-algebra with a section $\pi_A\colon A \to \mathcal{O}$ and let $I_A = \ker \pi_A$. If $A$ is a complete intersection over $\mathcal{O}$, then $\mathrm{Fit}_A(I_A) = \mathrm{Ann}_A(I_A)$, and this ideal is a non-zero direct $\mathcal{O}$-summand of $A$.


Proof. Suppose $A = \mathcal{O}[[X_1,\dots,X_n]]/(f_1,\dots,f_n)$. Since $\mathcal{O}$ is complete, a linear change of variables that replaces $X_i$ by $X_i - \pi_A(X_i)$ gives that $(f_1,\dots,f_n) \subset (X_1,\dots,X_n)$. The result now follows from Proposition 2.1.


We conclude this section with some remarks that will not be used in the 
rest of this paper. 


The Gorenstein condition. Let $A$ be a finite flat $\mathcal{O}$-algebra. Then the $\mathcal{O}$-linear dual $A^\vee = \mathrm{Hom}_{\mathcal{O}}(A,\mathcal{O})$ of $A$ has an $A$-module structure given by $(af)(x) = f(ax)$ for $f \in A^\vee$ and $a, x \in A$. The algebra $A$ is called Gorenstein over $\mathcal{O}$ if $A^\vee$ is a free $A$-module of rank 1.

It follows from Proposition 2.1 (ii) that for $A$ of the form

$$\mathcal{O}[[X_1,\dots,X_n]]/(f_1,\dots,f_n)$$

with $(f_1,\dots,f_n) \subset (X_1,\dots,X_n)$, there exists an $\mathcal{O}$-linear map $t\colon A \to \mathcal{O}$ with $t(d) = 1$. This homomorphism $t$ generates $A^\vee$ as an $A$-module, so that $A$ is Gorenstein over $\mathcal{O}$. To see this when $\mathcal{O}$ is a field, one notes that $(d) \not\subset \mathrm{Ann}_A(t)$, so that $\mathrm{Ann}_A(t) = 0$ by Corollary 2.2. With Nakayama's lemma, the general case then follows as well.

In general, suppose that $A$ is Gorenstein, so there is an $A$-module isomorphism $s\colon A^\vee \to A$. Assume in addition that there exists a section $\pi_A\colon A \to \mathcal{O}$, and put $I_A = \ker \pi_A$. Then the image of the composite map

$$\mathcal{O} = \mathcal{O}^\vee \xrightarrow{\;\pi_A^\vee\;} A^\vee \xrightarrow{\;s\;} A$$

is $\mathrm{Ann}_A(I_A)$. To see this, one notes that the image of $\pi_A^\vee$ is

$$\mathcal{O}\cdot\pi_A = \{ f \in A^\vee : f(I_A) = 0 \},$$

and that

$$f(I_A) = 0 \iff I_A f = 0 \iff s(f) \in \mathrm{Ann}_A(I_A).$$

Applying $\pi_A$ we see that the congruence ideal $\eta_A = \pi_A \mathrm{Ann}_A(I_A)$ is equal to the $\mathcal{O}$-ideal generated by $\pi_A \circ s \circ \pi_A^\vee(1)$. It is this property that Wiles uses to define the congruence ideal in the Gorenstein case.


More general complete intersections. The statement that finite complete intersection algebras are Gorenstein holds over much more general base rings, and it also holds if there is no section $A \to \mathcal{O}$. Moreover, one can omit the flatness condition on $A$ in Proposition 2.1, because it follows from the other assumptions. More precisely, if $\mathcal{O}$ is any ring and the ring $A = \mathcal{O}[X_1,\dots,X_n]/(f_1,\dots,f_n)$ is finitely generated as an $\mathcal{O}$-module, then one can show with Koszul complexes that $A$ is projective as an $\mathcal{O}$-module [3]. An argument of Tate [6, appendix] then implies that $A^\vee$ is free of rank 1 over $A$. For Noetherian $\mathcal{O}$ the class of finite $\mathcal{O}$-algebras of the form $\mathcal{O}[[X_1,\dots,X_n]]/(f_1,\dots,f_n)$ is a subclass of the class of finite algebras of the form $\mathcal{O}[X_1,\dots,X_n]/(f_1,\dots,f_n)$; see [3]. In particular, these algebras are also projective and Gorenstein over $\mathcal{O}$.


3. Proof of Criterion I 


In this section we first prove the theorem in the introduction and then show Criterion I. Using Nakayama's lemma we first show that the question whether $\varphi$ is an isomorphism reduces to the case that $\mathcal{O}$ is a field.


Lemma 3.1. Let $f\colon A \to B$ be a surjective homomorphism of Noetherian local $\mathcal{O}$-algebras for which $B$ is finite flat over $\mathcal{O}$. Suppose that the induced map $\bar{f}\colon A \otimes_{\mathcal{O}} k \to B \otimes_{\mathcal{O}} k$ is an isomorphism. Then $f$ is an isomorphism.


Proof. By applying Nakayama's lemma to $B$ as an $\mathcal{O}$-module we see that $f$ is surjective. Since $B$ is $\mathcal{O}$-free, $(\ker f) \otimes_{\mathcal{O}} k$ is the kernel of $\bar{f}$, which is zero. The ring $A$ is Noetherian, so $\ker f$ is finitely generated as an $A$-module. Since $\mathfrak{m}_{\mathcal{O}}$ is contained in the maximal ideal of $A$ we can apply Nakayama's lemma to the $A$-module $\ker f$ and conclude that $\ker f = 0$.


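Now we give the proof of the theorem stated in the introduction. Recall that we have a commutative triangle of surjective homomorphisms of complete Noetherian local $\mathcal{O}$-algebras with $T$ finite and flat over $\mathcal{O}$:

$$
\begin{array}{ccc}
R & \xrightarrow{\quad\varphi\quad} & T \\
 & {\scriptstyle \pi_R}\searrow \quad \swarrow{\scriptstyle \pi_T} & \\
 & \mathcal{O}. &
\end{array}
$$

We let $I_R = \ker \pi_R$ and $I_T = \ker \pi_T$.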


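Theorem. The map $\varphi$ is an isomorphism between complete intersections over $\mathcal{O}$ if and only if $\varphi\,\mathrm{Fit}_R(I_R) \not\subset \mathfrak{m}_{\mathcal{O}}T$.


Proof. In order to show “only if”, we note that by Corollary 2.3, $\mathrm{Fit}_T(I_T)$ is a non-zero direct $\mathcal{O}$-summand of $T$ and in particular $\varphi\,\mathrm{Fit}_R(I_R) = \mathrm{Fit}_T(I_T) \not\subset \mathfrak{m}_{\mathcal{O}}T$.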

To show “if”, suppose first that $\mathcal{O} = k$ is a field. Since $R$ is complete and Noetherian, we can write $R = k[[X_1,\dots,X_n]]/J_R$ where $J_R$ is a $k[[X_1,\dots,X_n]]$-ideal. Since $T$ is a finite dimensional $k$-vector space, we can do this in such a way that the elements $\varphi(X_i \bmod J_R)$ generate $I_T$ as a $k$-vector space. The kernel $J_T$ of the composite map

$$k[[X_1,\dots,X_n]] \to R \to T$$

is contained in the ideal $I = (X_1,\dots,X_n)$. We assume that $\varphi\,\mathrm{Fit}_R(I_R) \ne 0$, which means that there are polynomials $g_{ij} \in k[[X_1,\dots,X_n]]$ so that $\sum_j g_{ij}X_j \in J_R$ for $i = 1,\dots,n$ and $\det(g_{ij}) \notin J_T$.

Since the elements $X_i$ generate $I/J_T$ as a $k$-vector space, the monomials $X_iX_j$ generate $I^2/IJ_T$ as a $k$-vector space. This implies that every element of the quotient ring $k[[X_1,\dots,X_n]]/IJ_T$ is represented by a polynomial of total degree at most 2. Therefore, we can, for $i = 1,\dots,n$, find polynomials $p_i$ and $q_i$ of total degree at most 2, so that

$$p_i \equiv \sum_j g_{ij}X_j \pmod{IJ_T}, \qquad q_i \equiv X_i^3 \pmod{IJ_T}.$$

We now let the polynomials $f_1,\dots,f_n$ be

$$f_i = X_i^3 - q_i + p_i, \qquad i = 1,\dots,n.$$

Note first that $f_i \in IJ_T + J_R \subset J_T$ and that $f_i = \sum_j G_{ij}X_j$ with $G_{ij} \equiv g_{ij} \bmod J_T$.

The $k$-algebra $B = k[X_1,\dots,X_n]/(f_1,\dots,f_n)$ has finite dimension as a $k$-vector space, because every element in $B$ is represented by a polynomial of degree at most 2 in each variable. Therefore, $B$ is Artinian and it is a finite product of local Artinian rings. Hence, the completion $\hat{B} = k[[X_1,\dots,X_n]]/(f_1,\dots,f_n)$ of $B$ at $(X_1,\dots,X_n)$ is a factor of $B$, so it is also finite dimensional over $k$. By Corollary 2.2 the $\hat{B}$-ideal generated by $\det(G_{ij})$ is the unique minimal non-zero ideal of $\hat{B}$. Since $\det(G_{ij}) \equiv \det(g_{ij}) \not\equiv 0 \pmod{J_T}$, this minimal ideal does not map to 0 in $T$. It follows that the map $\hat{B} \to T$ is an isomorphism. Thus, $T$ is a complete intersection over $k$, and $J_T = (f_1,\dots,f_n) \subset IJ_T + J_R$. By Nakayama's lemma we must have $J_T = J_R$ so that $\varphi$ is an isomorphism. This completes the proof in the case that $\mathcal{O} = k$.

We now prove the “if” part for general $\mathcal{O}$. The map $\pi_R\colon R \to \mathcal{O}$ is an $\mathcal{O}$-split surjection, so the induced map $R \otimes_{\mathcal{O}} k \to k$ has kernel $I_R \otimes_{\mathcal{O}} k$. Since $\mathrm{Fit}_{R \otimes_{\mathcal{O}} k}(I_R \otimes_{\mathcal{O}} k)$ is the image in $R \otimes_{\mathcal{O}} k$ of $\mathrm{Fit}_R(I_R)$, the case that we proved already implies that the map $R \otimes_{\mathcal{O}} k \to T \otimes_{\mathcal{O}} k$ is an isomorphism between complete intersections over $k$. Lemma 3.1 implies that $\varphi$ is an isomorphism. Moreover, we can lift any $k$-algebra isomorphism

$$k[[X_1,\dots,X_n]]/(\bar{f}_1,\dots,\bar{f}_n) \to T \otimes_{\mathcal{O}} k$$

to a surjective $\mathcal{O}$-algebra homomorphism $\psi\colon \mathcal{O}[[X_1,\dots,X_n]] \to T$. The kernel of $\psi$ contains lifts $f_i$ of the elements $\bar{f}_i$, and by Lemma 3.1 the induced map

$$\mathcal{O}[[X_1,\dots,X_n]]/(f_1,\dots,f_n) \to T$$

is an isomorphism. This proves the theorem.


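Proof of Criterion I. First we show the inequality. By Prop. 1.1 (i) we have $\mathrm{Fit}_R(I_R) \subset \mathrm{Ann}_R(I_R)$. Since the map $I_R \to I_T$ is surjective, we have $\varphi\,\mathrm{Ann}_R(I_R) \subset \mathrm{Ann}_T(I_T)$. Hence we see that

$$\pi_R\,\mathrm{Fit}_R(I_R) = \pi_T\varphi\,\mathrm{Fit}_R(I_R) \subset \pi_T\,\mathrm{Ann}_T(I_T) = \eta_T = \mathfrak{m}_{\mathcal{O}}^{\mathrm{length}_{\mathcal{O}}(\mathcal{O}/\eta_T)}.$$

Viewing $\mathcal{O}$ as an $R$-algebra via $\pi_R\colon R \to \mathcal{O}$ we have $I_R \otimes_R \mathcal{O} = I_R/I_R^2$. By Prop. 1.1 (ii) this implies that

$$\pi_R\,\mathrm{Fit}_R(I_R) = \mathrm{Fit}_{\mathcal{O}}(I_R/I_R^2) = \mathfrak{m}_{\mathcal{O}}^{\mathrm{length}_{\mathcal{O}}(I_R/I_R^2)},$$

and it follows that $\mathrm{length}_{\mathcal{O}}(I_R/I_R^2) \ge \mathrm{length}_{\mathcal{O}}(\mathcal{O}/\eta_T)$. Moreover, if $\varphi$ is an isomorphism between complete intersections, then by Corollary 2.3 we have $\varphi\,\mathrm{Fit}_R(I_R) = \mathrm{Ann}_T(I_T)$, and therefore the two lengths are equal.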

To show the converse, assume that the two lengths are equal, i.e., we have that $\pi_R\,\mathrm{Fit}_R(I_R) = \pi_T\,\mathrm{Ann}_T(I_T)$. We first show that $I_T \cap \mathrm{Ann}_T(I_T) = 0$. Since $\eta_T \ne 0$ there is an element $y \in \mathrm{Ann}_T(I_T)$ for which $\pi_T(y) \ne 0$. For any element $x \in I_T \cap \mathrm{Ann}_T(I_T)$ we clearly have $xy = 0$ and $x(y - \pi_T(y)) = 0$. But then $\pi_T(y)x = 0$, and since $T$ is free as a module over the discrete valuation ring $\mathcal{O}$ this implies that $x = 0$. This shows that $I_T \cap \mathrm{Ann}_T(I_T) = 0$.

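It follows that the map $\pi_T\colon \mathrm{Ann}_T(I_T) \to \eta_T$ is an isomorphism. Since

$$\pi_T\varphi\,\mathrm{Fit}_R(I_R) = \pi_R\,\mathrm{Fit}_R(I_R) = \pi_T\,\mathrm{Ann}_T(I_T),$$

we conclude that $\varphi\,\mathrm{Fit}_R(I_R) = \mathrm{Ann}_T(I_T)$. This non-zero $\mathcal{O}$-submodule of $T$ cannot be contained in $\mathfrak{m}_{\mathcal{O}}T$ because $T/\mathrm{Ann}_T(I_T)$ injects canonically to $\mathrm{End}_{\mathcal{O}}(I_T)$, which is torsion free as an $\mathcal{O}$-module. By the theorem this can only happen if $\varphi$ is an isomorphism of complete intersections. This proves Criterion I.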
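As a sanity check on the two lengths compared in this proof, here is a small worked example. It is a sketch under assumptions that are not in the text: $\mathcal{O}$ is a complete discrete valuation ring with uniformizer $\varpi$, $a \ge 1$ is an integer, $R = T = \mathcal{O}[[X]]/(X(X - \varpi^a))$, the map $\varphi$ is the identity, and $\pi_T$ sends $X$ to $0$.

```latex
\begin{align*}
  T &= \mathcal{O}\cdot 1 \oplus \mathcal{O}\cdot X, \qquad X^2 = \varpi^a X,\\
  I_T &= \ker \pi_T = \mathcal{O}\cdot X, \qquad I_T^2 = \varpi^a\,\mathcal{O}\cdot X,\\
  I_T/I_T^2 &\cong \mathcal{O}/\varpi^a\mathcal{O}, \qquad
      \mathrm{length}_{\mathcal{O}}(I_T/I_T^2) = a,\\
  \mathrm{Ann}_T(I_T) &= \{\, u + vX : (u + v\varpi^a)X = 0 \,\}
      = \mathcal{O}\cdot(X - \varpi^a),\\
  \eta_T &= \pi_T\,\mathrm{Ann}_T(I_T) = \varpi^a\mathcal{O}, \qquad
      \mathrm{length}_{\mathcal{O}}(\mathcal{O}/\eta_T) = a .
\end{align*}
```

The two lengths agree, which is consistent with the criterion: this $T$ is presented by one variable and one relation, so it is a complete intersection over $\mathcal{O}$.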


Remark. If $T$ is Gorenstein over $\mathcal{O}$ (see the end of Section 2), or if $\mathcal{O}$ is a complete discrete valuation ring, then it is not hard to show that $\mathrm{Ann}_T(I_T)$ is a non-zero direct $\mathcal{O}$-summand of $T$. By Corollary 2.3 the condition $\varphi\,\mathrm{Fit}_R(I_R) \not\subset \mathfrak{m}_{\mathcal{O}}T$ in the theorem can then be replaced by $\varphi\,\mathrm{Fit}_R(I_R) = \mathrm{Ann}_T(I_T)$. This may fail for other rings $\mathcal{O}$. For instance, let $k$ be a field, and let $\mathcal{O} = k[\varepsilon]$ with $\varepsilon^2 = 0$. The ring $T = \mathcal{O}[[X,Y]]/(X^2, Y^2, XY - \varepsilon X - \varepsilon Y)$, with $I_T = (X,Y)$, is a finite flat $\mathcal{O}$-algebra with $\mathrm{Fit}_T(I_T) = \mathrm{Ann}_T(I_T) = (\varepsilon X, \varepsilon Y)$, but $T$ is not a complete intersection over $\mathcal{O}$.


4. Proof of Criterion II


In this section we prove Criterion II. Just as in Section 3, we first give the 
argument over a field, and then apply Nakayama’s lemma. 


Lemma 4.1. Let $k$ be a field and let $n \ge 1$. Suppose we have $k$-algebra homomorphisms

$$k[[S_1,\dots,S_n]] \to k[[X_1,\dots,X_n]] \xrightarrow{\;f\;} A$$

with $f$ surjective, and suppose that the $k$-algebra $A/(S_1,\dots,S_n)A$ has finite dimension $d$ as a vector space over $k$. Assume that for some $N > n^{n-1}d^n$, the induced map

$$g\colon k[[S_1,\dots,S_n]]/(S_1^N,\dots,S_n^N) \to A/(S_1^N,\dots,S_n^N)A$$

is injective. Then $f$ induces an isomorphism of $k$-algebras

$$k[[X_1,\dots,X_n]]/(S_1,\dots,S_n) \to A/(S_1,\dots,S_n)A.$$


Proof. The ring $k[[X_1,\dots,X_n]]$ is a local ring with maximal ideal $I = (X_1,\dots,X_n)$. Since $A/(S_1,\dots,S_n)A$ has length $d$ as a module over the ring $k[[X_1,\dots,X_n]]$, it is annihilated by $I^d$. Writing $J = \ker f$, this means that

$$I^d \subset J + (S_1,\dots,S_n),$$

where $(S_1,\dots,S_n)$ denotes the ideal of $k[[X_1,\dots,X_n]]$ generated by the $S_i$. We will show that $J \subset I^{d+1}$ by assuming that we can find $a \in J$ with $a \notin I^{d+1}$, and deriving a contradiction. Consider the multiplication by $a$ map:

$$0 \to \ker \to k[[X_1,\dots,X_n]]/I^{ndN} \xrightarrow{\;a\;} k[[X_1,\dots,X_n]]/I^{ndN} \to \mathrm{cok} \to 0.$$

Since $k[[X_1,\dots,X_n]]/I^{ndN}$ has finite dimension over $k$, it follows that $\dim_k(\ker) = \dim_k(\mathrm{cok})$. We give estimates for these two dimensions. We have inclusions of $k[[X_1,\dots,X_n]]$-ideals

$$I^{ndN} \subset \bigl(J + (S_1,\dots,S_n)\bigr)^{nN} \subset J + (S_1^N,\dots,S_n^N),$$

so the cokernel $\mathrm{cok} = k[[X_1,\dots,X_n]]/(I^{ndN} + (a))$ now maps surjectively to the quotient ring $k[[X_1,\dots,X_n]]/(J + (S_1^N,\dots,S_n^N)) = A/(S_1^N,\dots,S_n^N)A$. Since $g$ is injective this gives

$$\dim_k \mathrm{cok} \;\ge\; \dim_k A/(S_1^N,\dots,S_n^N)A \;\ge\; \dim_k k[[S_1,\dots,S_n]]/(S_1^N,\dots,S_n^N) = N^n.$$

On the other hand, since $a \notin I^{d+1}$, we have $\ker \subset I^{ndN-d}/I^{ndN}$ so that $\dim_k(\ker)$ is at most the number of monomials of degree $\delta$ with $ndN - d \le \delta \le ndN - 1$. Therefore

$$\dim_k \ker \;\le\; \sum_{\delta = ndN-d}^{ndN-1} \binom{\delta + n - 1}{n - 1} \;\le\; d\,(ndN)^{n-1}.$$

Combining the two estimates we see that $N^n \le d\,(ndN)^{n-1}$, which contradicts the assumption that $N > n^{n-1}d^n$. This proves that $J \subset I^{d+1}$.
To finish the proof of the lemma, consider the inclusions

$$I^d \subset J + (S_1,\dots,S_n) \subset I^{d+1} + (S_1,\dots,S_n).$$

By Nakayama's lemma we see that $I^d \subset (S_1,\dots,S_n)$, so that

$$\ker f = J \subset I^{d+1} \subset I^d \subset (S_1,\dots,S_n).$$

Since $f$ induces an isomorphism $k[[X_1,\dots,X_n]]/J \to A$, the lemma follows.
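The counting step in this proof can be checked numerically for small parameters. The sketch below is illustrative only: the function names are hypothetical, and the bound being tested is the monomial count $\sum_{\delta} \binom{\delta+n-1}{n-1}$ over the $d$ degrees just below $ndN$, compared with $d(ndN)^{n-1}$ and with $N^n$.

```python
from math import comb


def kernel_bound(n, d, N):
    """Number of monomials in n variables whose degree lies in the
    d consecutive values ending at n*d*N - 1."""
    top = n * d * N
    return sum(comb(delta + n - 1, n - 1) for delta in range(top - d, top))


def crude_bound(n, d, N):
    """The simpler upper bound d * (n*d*N)^(n-1) used in the proof."""
    return d * (n * d * N) ** (n - 1)


def smallest_admissible_N(n, d):
    """Smallest integer N with N > n^(n-1) * d^n."""
    return n ** (n - 1) * d ** n + 1


if __name__ == "__main__":
    for n in range(1, 4):
        for d in range(1, 4):
            N = smallest_admissible_N(n, d)
            print(n, d, N, kernel_bound(n, d, N), crude_bound(n, d, N), N ** n)
```

For every admissible $N$ the last printed column exceeds the monomial count, which is the contradiction used above; note that the monomial count can equal the crude bound (for instance $n = 2$, $d = 1$, $N = 3$), so the middle comparison is not strict in general.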


We now return to the setting in which Criterion II is formulated: we let $\mathcal{O}$ be a complete Noetherian local ring and suppose that its residue field $k$ has characteristic $p > 0$. Let $n \ge 1$ and for $m \ge 0$ let $J_m$ be the $\mathcal{O}[[S_1,\dots,S_n]]$-ideal $(\omega_m(S_1),\dots,\omega_m(S_n))$, where $\omega_m(S)$ denotes the polynomial $(1+S)^{p^m} - 1$.


Corollary 4.2. Suppose we have $\mathcal{O}$-algebra homomorphisms

$$\mathcal{O}[[S_1,\dots,S_n]] \to \mathcal{O}[[X_1,\dots,X_n]] \xrightarrow{\;f\;} A$$

with $f$ surjective, and $A/(S_1,\dots,S_n)A$ free of rank $d > 0$ over $\mathcal{O}$. If, for some $m$ with $p^m > n^{n-1}d^n$ the quotient ring $A/J_mA$ is free as a module over $\mathcal{O}[[S_1,\dots,S_n]]/J_m$, then the induced map

$$h\colon \mathcal{O}[[X_1,\dots,X_n]]/(S_1,\dots,S_n) \to A/(S_1,\dots,S_n)A$$

is an isomorphism between complete intersections over $\mathcal{O}$.


Proof. Taking everything modulo $\mathfrak{m}_{\mathcal{O}}$ we see that for the $k$-algebra $\bar{A} = A \otimes_{\mathcal{O}} k$, the quotient ring $\bar{A}/(S_1^{p^m},\dots,S_n^{p^m})\bar{A}$ is a non-zero free module over $k[[S_1,\dots,S_n]]/(S_1^{p^m},\dots,S_n^{p^m})$. By Lemma 4.1 we see that $h$ is an isomorphism modulo $\mathfrak{m}_{\mathcal{O}}$, and Lemma 3.1 then implies that $h$ is an isomorphism. In particular we see that $\mathcal{O}[[X_1,\dots,X_n]]/(S_1,\dots,S_n)$ is finitely generated as an $\mathcal{O}$-module, so that it is a complete intersection. This shows 4.2.
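The reduction modulo the maximal ideal in this proof rests on the congruence $\omega_m(S) = (1+S)^{p^m} - 1 \equiv S^{p^m} \pmod p$, which is why the ideal $J_m$ becomes $(S_1^{p^m},\dots,S_n^{p^m})$ over the residue field. The following sketch checks this congruence on coefficients; the function names are hypothetical and `p` is assumed to be a prime.

```python
from math import comb


def omega_coefficients(p, m):
    """Integer coefficients c_0, ..., c_{p^m} of (1 + S)^(p^m) - 1."""
    q = p ** m
    coeffs = [comb(q, i) for i in range(q + 1)]
    coeffs[0] -= 1  # subtract the constant term 1
    return coeffs


def omega_mod_p(p, m):
    """The same coefficients reduced modulo p (p assumed prime)."""
    return [c % p for c in omega_coefficients(p, m)]


if __name__ == "__main__":
    print(omega_coefficients(2, 1))
    print(omega_mod_p(3, 2))
```

Only the top coefficient survives the reduction, so modulo $p$ the polynomial $\omega_m(S)$ is the monomial $S^{p^m}$.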


Proof of Criterion II. Let $d$ denote the $\mathcal{O}$-rank of $T$, and let $m$ be so large that $p^m > n^{n-1}d^n$. By property (i) there is a surjection

$$\mathcal{O}[[X_1,\dots,X_n]] \to R_m.$$

We now lift the homomorphism $\mathcal{O}[[S_1,\dots,S_n]] \to R_m$ to an $\mathcal{O}$-algebra homomorphism $\mathcal{O}[[S_1,\dots,S_n]] \to \mathcal{O}[[X_1,\dots,X_n]]$ and we apply Corollary 4.2 with $A = T_m$. We conclude that the composite map

$$\mathcal{O}[[X_1,\dots,X_n]]/(S_1,\dots,S_n) \to R_m/(S_1,\dots,S_n)R_m \to T_m/(S_1,\dots,S_n)T_m$$

is an isomorphism between complete intersections. It follows from property (iii) that $\varphi$ is an isomorphism between complete intersections as well.


ℓ-ADIC MODULAR DEFORMATIONS AND WILES’S
“MAIN CONJECTURE”


FRED DIAMOND AND KENNETH A. RIBET 


1. INTRODUCTION


Let $E$ be an elliptic curve over $\mathbf{Q}$. The Shimura-Taniyama conjecture asserts that $E$ is modular, i.e., that there is a weight-two newform $f$ such that $a_p(f) = a_p(E)$ for all primes $p$ at which $E$ has good reduction. Let $\ell$ be a prime, choose a basis for the Tate module $T_\ell(E)$ and consider the $\ell$-adic representation

$$\rho_{E,\ell}\colon G_{\mathbf{Q}} \to \mathrm{Aut}(T_\ell(E)) \cong \mathrm{GL}_2(\mathbf{Z}_\ell).$$

Then $E$ is modular if and only if $\rho_{E,\ell}$ is modular, i.e., if and only if $\rho_{E,\ell}$ is equivalent over $\overline{\mathbf{Q}}_\ell$ to the representation $\rho_{f,\lambda}$ for some $f$ (see [22]).

We aim to prove a stronger result which characterizes $\ell$-adic representations arising from modular forms. Since the coefficients of a newform lie in the ring of integers of a number field which is not necessarily $\mathbf{Q}$, we are led to consider representations

$$\rho\colon G_{\mathbf{Q}} \to \mathrm{GL}_2(A) \tag{1}$$

where $A$ is the ring of integers of a finite extension of $\mathbf{Q}_\ell$, and ask which of these arise from modular forms. In fact, it turns out to be convenient to consider representations $\rho$ as in (1) where now $A$ is a complete local Noetherian ring with finite residue field $k$, and formulate a notion of modularity for such a representation. Roughly speaking, the main theorem of Wiles in [26] supposes we are given such a representation which is “plausibly modular” and concludes that under certain technical hypotheses:

$$\bar\rho \text{ modular} \implies \rho \text{ modular}, \tag{2}$$

where $\bar\rho\colon G_{\mathbf{Q}} \to \mathrm{GL}_2(k)$ is the reduction of $\rho$.
Returning to the elliptic curve $E$, suppose we know that

$$\bar\rho_{E,\ell}\colon G_{\mathbf{Q}} \to \mathrm{GL}_2(\mathbf{F}_\ell)$$

is modular (by Langlands-Tunnell [16], for example, if $\ell = 3$ and the representation is irreducible). A result of the form (2) then implies $\rho_{E,\ell}$ is modular (assuming it is “plausibly modular” and satisfies the technical hypotheses), hence so is $E$!

To orient the reader, we recall the connection with Fermat’s Last Theorem, which begins with an idea of G. Frey (see [14]): Suppose that we have a non-trivial solution $a^n + b^n = c^n$, where $n \ge 5$ is prime, $b$ is even and $a \equiv -1 \bmod 4$. Consider the elliptic curve

$$E\colon Y^2 = X(X - a^n)(X + b^n).$$

The modularity of $E$ implies in particular that the mod $n$ representation $\bar\rho_{E,n}$ is modular. The main result of [21] (see [12]) then shows that $\bar\rho_{E,n}$ arises from a (non-zero) modular form of weight two and level two, a contradiction as there are no such forms.


2. STRATEGY 


We begin our formal discussion by returning to the phrase “plausibly modular,” which we used in connection with representations like (1). For simplicity, assume again that $A$ is the ring of integers of a finite extension of $\mathbf{Q}_\ell$. Consider representations $\rho\colon G_{\mathbf{Q}} \to \mathrm{GL}_2(A)$ which arise from weight-two eigenforms on $\Gamma_0(N)$, where $N$ is allowed to vary. The idea is to exhibit a list of conditions satisfied by these representations which encapsulates their modularity. Those $\rho$ which arise from weight-two forms on $\Gamma_0(N)$ are irreducible with cyclotomic determinant. Further, they are unramified outside a finite set of prime numbers (namely, they are unramified at the primes not dividing $\ell N$). It is tempting to guess that any $\rho$ satisfying these simple conditions is likely to be modular. In fact, however, one also needs to impose a further condition on the restriction of $\rho$ to a decomposition group at $\ell$. A sufficient condition in this direction is conjectured by Fontaine and Mazur in [13], but with the current technology we need to impose a stronger one in order to obtain results. We will introduce a condition which is both convenient to work with and sufficient for applications to elliptic curves with semistable reduction at $\ell$. Namely, suppose now that the prime number $\ell$ is odd. Then we will assume that the representation $\rho$ is “semistable” [2], i.e., that it is either finite flat or ordinary at $\ell$ in the terminology of [19].

Let us fix an irreducible mod $\ell$ representation

$$\bar\rho\colon G_{\mathbf{Q}} \to \mathrm{GL}_2(k)$$

whose determinant is the mod $\ell$ cyclotomic character. Denote by $\mathcal{R}$ the set of isomorphism classes of “plausibly modular” $\rho$ as above with reduction $\bar\rho$. We denote by $\mathcal{T}$ the set of (genuinely) modular isomorphism classes in $\mathcal{R}$. Thus $\mathcal{T}$ is a set of modular forms giving rise to $\bar\rho$, and the goal is to prove $\mathcal{R} = \mathcal{T}$.

We suppose that $\mathcal{T}$ is not empty; we can paraphrase this condition by the statement that $\bar\rho$ is modular and semistable (locally at the prime $\ell$). It is known that $\mathcal{T}$ is infinite [20], so that $\mathcal{R} \supset \mathcal{T}$ is infinite. This circumstance makes $\mathcal{R}$ rather unwieldy, so that we are led to filter $\mathcal{R}$ as follows: For each finite set of primes $\Sigma$, we will let $\mathcal{R}_\Sigma$ be the set of $\rho$ in $\mathcal{R}$ which are of “type $\Sigma$,” meaning they are well-behaved outside $\Sigma$, and set $\mathcal{T}_\Sigma = \mathcal{T} \cap \mathcal{R}_\Sigma$. Before making precise the notion of being “well-behaved,” we remark that $\mathcal{R}$ will be the union of the $\mathcal{R}_\Sigma$ over all finite sets of primes $\Sigma$. Therefore, to prove that $\mathcal{R} = \mathcal{T}$, it will suffice to prove

$$\mathcal{R}_\Sigma = \mathcal{T}_\Sigma \tag{3}$$

for all $\Sigma$. With the definition we give, each set $\mathcal{R}_\Sigma$ corresponds, at least conjecturally, to an easily described finite set of modular forms.

We now give a preliminary definition of “type $\Sigma$” in terms of the conductors¹ $N(\rho)$ and $N(\bar\rho)$: We say that $\rho$ is type $\Sigma$ if $\Sigma$ contains the set of prime divisors of $N(\rho)/N(\bar\rho)$ (an integer by [1] or [18]).

We shall assume below that $N(\bar\rho)$ is square-free, in which case this preliminary definition of type $\Sigma$ turns out to be suitable (cf. remark 3.3), but we shall have to extend it (along with the notions of plausibly modular and modular) to representations $\rho$ as in (1) where $A$ is a complete local Noetherian ring with finite residue field. The purpose is to work in the context of Mazur’s deformation theory [19], thereby introducing more structure into the problem and enabling us to use tools from commutative algebra. The desired equality (3) is subsumed by Wiles’s “Main Conjecture,” a precise version of (2) which takes the following form: A certain ring homomorphism

$$R_\Sigma \to \mathbf{T}_\Sigma \tag{4}$$

is an isomorphism, where

• $R_\Sigma$ is a universal deformation ring which parametrizes representations of type $\Sigma$ with a fixed residual representation $\bar\rho$;
• $\mathbf{T}_\Sigma$ is a Hecke algebra which parametrizes the newforms of weight two giving rise to such representations.

Wiles’s strategy is to prove this first in the case $\Sigma = \emptyset$, and then deduce the result for arbitrary $\Sigma$. The aim of this article is to explain the statement of the conjecture and the reduction to the case $\Sigma = \emptyset$.


3. THE “MAIN CONJECTURE” 


3.1. The Hecke algebra. Once again, we fix a representation

$$\bar\rho\colon G_{\mathbf{Q}} \to \mathrm{GL}_2(k)$$

where $k$ is a finite field of characteristic $\ell$. We assume:
(a) $\ell$ is odd;
(b) $\bar\rho$ is irreducible;
(c) the restriction of $\bar\rho$ to a decomposition group at $\ell$ is finite flat or ordinary;
(d) $\bar\rho$ has cyclotomic determinant;
(e) $\bar\rho$ has square-free conductor.

¹ Use the following ad hoc definition of the exponent of $\ell$ in the conductor of a representation which is semistable at $\ell$ and has cyclotomic determinant: it is 0 if the representation is finite flat and 1 otherwise.


Remark 3.1. While hypotheses (a)-(c) are needed for the existing methods, the last two are made to simplify the exposition. Their presence causes no problem in the application to Fermat’s Last Theorem, since they are satisfied by the mod $\ell$ representations coming from semistable elliptic curves. See [8] for a discussion of how to work without them: hypothesis (d) is not so serious; hypothesis (e) was removed in [7].


For a reason fundamental to Wiles’s method, we must make the further assumption that $\bar\rho$ is modular. Roughly speaking, this means that $\bar\rho$ is equivalent to the reduction of a representation $\rho_{f,\lambda}$ arising from a modular form. To make this assumption precise, let us fix embeddings $\overline{\mathbf{Q}} \to \overline{\mathbf{Q}}_\ell$, $\overline{\mathbf{Q}} \to \mathbf{C}$. We also fix an embedding of $k$ in $\overline{\mathbf{F}}_\ell$, where $\overline{\mathbf{F}}_\ell$ is the residue field of the ring of integers of $\overline{\mathbf{Q}}_\ell$. Suppose that $f$ is a newform of weight two, level $N_f$ and trivial character, and let $K_f$ denote the number field generated by its coefficients $a_n(f)$. The chosen embeddings determine a prime $\lambda$ of $\mathcal{O}_f$, the ring of integers of $K_f$. We write simply $\rho_f$ for $\rho_{f,\lambda}$ (see [22] where this representation is denoted $\rho_\lambda$ and defined in the course of the proof of theorem 4). Thus $\rho_f$ is the absolutely irreducible representation

$$G_{\mathbf{Q}} \to \mathrm{GL}_2(K_{f,\lambda})$$

characterized up to isomorphism by the following property:

(5) If $p$ is a prime not dividing $\ell N_f$, then $\rho_f$ is unramified at $p$, $\mathrm{tr}\,\rho_f(\mathrm{Frob}_p) = a_p(f)$ and $\det \rho_f(\mathrm{Frob}_p) = p$.

One can choose a basis so that the image of $\rho_f$ is contained in $\mathrm{GL}_2(\mathcal{O}_{f,\lambda})$, and the reduction

$$\bar\rho_f\colon G_{\mathbf{Q}} \to \mathrm{GL}_2(\mathcal{O}_f/\lambda)$$

is well-defined up to semi-simplification.

We assume $\bar\rho$ is equivalent over $\overline{\mathbf{F}}_\ell$ to $\bar\rho_f$ for some $f$ as above. It turns out that if this assumption holds, then in fact there are infinitely many $f$ to choose from. (As we mentioned above, this follows from the results of [20].) Given a finite set of primes $\Sigma$, we can then ask which of these $f$ give rise to representations of type $\Sigma$, in the sense that $\rho_f$ is semistable at $\ell$ and $\Sigma$ contains the set of primes dividing $N(\rho_{f,\lambda})/N(\bar\rho)$. A sufficient condition is that the level of $f$ divide $N_\Sigma$ where

$$N_\Sigma = N(\bar\rho) \prod_{p \in \Sigma} p^{m_p}$$

and the $m_p$ are defined as follows:

• $m_p = 2$ if $p$ does not divide $\ell N(\bar\rho)$;
• $m_p = 1$ if $p \ne \ell$ and $p$ divides $N(\bar\rho)$;
• $m_\ell = 1$ if $\bar\rho$ is finite flat and ordinary at $\ell$;
• $m_\ell = 0$ otherwise.
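The recipe for the level $N_\Sigma$ is purely arithmetic and can be spelled out as a short function. This is a sketch only: the function name and its arguments are hypothetical, the conductor $N(\bar\rho)$ and the local behaviour at $\ell$ are taken as given inputs rather than computed, and the numbers in the usage lines are arbitrary illustrative values, not data about any particular representation.

```python
def level_n_sigma(n_rhobar, ell, sigma, flat_and_ordinary):
    """Return N_Sigma = N(rhobar) * product over p in sigma of p^(m_p).

    n_rhobar          -- the conductor N(rhobar), assumed given
    ell               -- the residue characteristic (an odd prime)
    sigma             -- a finite set of distinct primes
    flat_and_ordinary -- True if rhobar is finite flat and ordinary at ell
    """
    level = n_rhobar
    for p in sigma:
        if p == ell:
            m_p = 1 if flat_and_ordinary else 0
        elif n_rhobar % p == 0:
            m_p = 1  # p differs from ell and divides N(rhobar)
        else:
            m_p = 2  # p divides neither ell nor N(rhobar)
        level *= p ** m_p
    return level


if __name__ == "__main__":
    # Illustrative inputs only: N(rhobar) = 11 and ell = 3.
    print(level_n_sigma(11, 3, set(), False))
    print(level_n_sigma(11, 3, {2, 3, 11}, True))
```

With the empty set the function returns $N(\bar\rho)$ itself, and enlarging $\Sigma$ only multiplies the level, so the levels $N_\Sigma$ grow with $\Sigma$ by divisibility.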


The motivation for this definition of $N_\Sigma$ is that this condition is known to be necessary as well as sufficient, as long as we restrict our attention to forms $f$ with trivial character and level not divisible by $\ell^2$. (The proof of the necessity relies on the Deligne-Langlands-Carayol theorem, an analysis of possible values of $N(\rho)/N(\bar\rho)$ which is due independently to Carayol and Livné, and well-known results on the reduction of modular curves, abelian varieties and $\ell$-divisible groups. We shall not, however, make use of this fact; indeed it turns out to be a consequence of what follows.)

Let $\Phi_\Sigma$ denote the set of newforms $f$ of weight two, trivial character and level dividing $N_\Sigma$. As explained in [12], it follows from the results of [21] and others on Serre’s conjecture that the set $\Phi_\emptyset$ is non-empty. The analogous statement holds a fortiori for each $\Phi_\Sigma$. We can then consider the ring

$$\tilde{\mathbf{T}}_\Sigma = \prod_{f \in \Phi_\Sigma} \mathcal{O}_{f,\lambda}.$$

Recall that for each $f$, the prime $\lambda$ of $K_f$ is determined by our choices of embeddings and note that $\tilde{\mathbf{T}}_\Sigma$ is semilocal and finitely generated as a $\mathbf{Z}_\ell$-module. For each prime $p$ not in $\Sigma$, we let $T_p$ denote the element $(a_p(f))_{f \in \Phi_\Sigma}$. We define the Hecke algebra $\mathbf{T}_\Sigma$ as the $\mathbf{Z}_\ell$-subalgebra of $\tilde{\mathbf{T}}_\Sigma$ generated by the elements $T_p$ for $p$ not in $\Sigma \cup \{\ell\}$.

We can give another description of $\mathbf{T}_\Sigma$ in terms of the subring $\mathbf{T}$ of the ring of endomorphisms of $S = S_2(\Gamma_0(N_\Sigma))$ generated by the operators $T_p$ for all primes $p$. We suppose that $f$ is in $\Phi_\Sigma$ and we define $f_\Sigma$ as a certain $\mathbf{T}$-eigenform in $S$ for which $f$ is the associated newform. The eigenform $f_\Sigma$ is characterized by this together with the properties:

• if $p$ is in $\Sigma \setminus \{\ell\}$, then $a_p(f_\Sigma) = 0$;
• if $\ell$ divides $N_\Sigma$, then $a_\ell(f_\Sigma)$ is an $\ell$-adic unit.

The map sending $T_p$ to the reduction of $a_p(f_\Sigma)$ defines a homomorphism $\mathbf{T} \to \overline{\mathbf{F}}_\ell$, and we write $\mathbf{T}_{\mathfrak{m}}$ for the completion of $\mathbf{T}$ at the kernel $\mathfrak{m}$ of this homomorphism. We then have the lemma:


Lemma 3.2. If $\ell$ is not in $\Sigma$, then the element $T_\ell$ of $\tilde{\mathbf{T}}_\Sigma$ is in $\mathbf{T}_\Sigma$. For arbitrary $\Sigma$ there is an isomorphism $\mathbf{T}_\Sigma \to \mathbf{T}_{\mathfrak{m}}$ such that $T_p \mapsto T_p$ for all $p$ not in $\Sigma$.


We refer the reader to section 2.3 of [26] or section 4.2 of [2] for the verification, which is tedious, unilluminating, and ultimately unnecessary (see the discussion in section 5 below).


3.2. The universal deformation ring. We appeal to Mazur’s deformation theory of Galois representations, discussed in [4] and [19], to define a certain universal deformation ring $R_\Sigma$.

Keep the technical hypotheses imposed on f at the beginning of the 
preceding section, and consider deformations of p of the form 


p: Gg — GL2(A), 


where A is a complete local Noetherian ring with residue field k. We say 
such a deformation is of type » if the following statements are true: 


(a) $\det \rho$ is cyclotomic.

(b) $\rho$ is finite flat or ordinary at $\ell$.

(c) let $p$ be a prime not in $\Sigma$. Then:
• if $p \neq \ell$ and $\bar\rho$ is unramified at $p$, then so is $\rho$;
• if $p \neq \ell$ and $\bar\rho$ is of type A at $p$, then so is $\rho$;
• if $p = \ell$ and $\bar\rho$ is finite flat, then so is $\rho$.


Remark 3.3. Suppose that $A$ is the ring of integers of a finite extension
of $\mathbf{Q}_\ell$ and the first two conditions are satisfied. In that case, one
can check that condition (c) is equivalent to the equality
$\mathrm{ord}_p N(\rho) = \mathrm{ord}_p N(\bar\rho)$. We shall not use this fact.


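The results discussed in [4] and [19] furnish a universal deformation ring
$R_\Sigma$ and a universal deformation

$$\rho_\Sigma^{\mathrm{univ}} : G_\mathbf{Q} \to \mathrm{GL}_2(R_\Sigma)$$

of $\bar\rho$ of type $\Sigma$.

Suppose now that $\bar\rho$ is modular. For a newform $f$ in $\Phi_\Sigma$, we let
$A_f$ denote the subring of $\mathcal{O}_{f,\lambda}$ consisting of those elements
whose reduction mod $\lambda$ is in $k$. One checks that with respect to some basis,
we have

$$\rho_f : G_\mathbf{Q} \to \mathrm{GL}_2(A_f),$$

a deformation of $\bar\rho$ of type $\Sigma$. The universal property of $R_\Sigma$
therefore furnishes a unique homomorphism $\pi_{f,\Sigma} : R_\Sigma \to A_f$ such
that the composite

$$G_\mathbf{Q} \to \mathrm{GL}_2(R_\Sigma) \to \mathrm{GL}_2(A_f)$$

is equivalent to $\rho_f$. Since $R_\Sigma$ is topologically generated by the traces
of $\rho_\Sigma^{\mathrm{univ}}(\mathrm{Frob}_p)$ for $p$ not in $\Sigma \cup \{\ell\}$,
we conclude that the image of the map

$$R_\Sigma \to \tilde{\mathbf{T}}_\Sigma, \qquad r \mapsto (\pi_{f,\Sigma}(r))_{f \in \Phi_\Sigma}$$

is precisely $\mathbf{T}_\Sigma$. We define $\phi_\Sigma$ to be the resulting
surjective ring homomorphism $R_\Sigma \to \mathbf{T}_\Sigma$.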
3.3. Statement of the conjecture. We keep the hypotheses on $\bar\rho$
imposed in section 3.1 (the hypothesis of modularity as well as the technical
conditions listed at the beginning of the section). We suppose that $\Sigma$ is a
finite set of primes and we consider the map $\phi_\Sigma$ defined at the end of
section 3.2. In our setting, Wiles’s conjecture 2.16 in [26] becomes:


Conjecture 3.4. The map $\phi_\Sigma$ is an isomorphism.


We briefly recall how the conjecture implies the Shimura-Taniyama
Conjecture for semistable elliptic curves (see also [3]).


Theorem 3.5. Suppose conjecture 3.4 holds and $E$ is an elliptic curve
with square-free conductor $N_E$. If there is an odd prime $\ell$ such that
$\bar\rho_{E,\ell}$ is irreducible and modular, then $E$ is modular.


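To prove this, one checks that under these hypotheses, the mod $\ell$
representation $\bar\rho_{E,\ell}$ satisfies the technical conditions of §3.1.
Moreover the $\ell$-adic representation $\rho_{E,\ell}$ is a deformation of
$\bar\rho_{E,\ell}$ of type $\Sigma$ for some $\Sigma$; for example, one can take
$\Sigma$ to be the set of primes dividing $N_E\ell$. On taking
$\mathcal{O} = \mathbf{Z}_\ell$, we obtain from the universal property of $R_\Sigma$
a homomorphism $\theta : R_\Sigma \to \mathbf{Z}_\ell$ where

$$\mathrm{tr}(\rho_\Sigma^{\mathrm{univ}}(\mathrm{Frob}_p)) \mapsto a_p(E) = p + 1 - \#E(\mathbf{F}_p)$$

for $p \neq \ell$ not in $\Sigma$. Now the conjecture implies $\phi_\Sigma$ is an
isomorphism, so we may consider the composite

$$\theta \circ \phi_\Sigma^{-1} : \mathbf{T}_\Sigma \to \mathbf{Z}_\ell.$$

One sees from the definition of $\mathbf{T}_\Sigma$ that such a homomorphism is
necessarily a projection $\pi_{f,\Sigma}$ for some $f$ in $\Phi_\Sigma$. It follows
that $\rho_{E,\ell}$ is isomorphic to $\rho_f$, or equivalently that

$$a_p(f) = a_p(E) \quad \text{for all } p \notin \Sigma.$$

Therefore $E$ is modular.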

If $\bar\rho_{E,3}$ is irreducible, then the Langlands-Tunnell theorem [16] shows
that $\bar\rho_{E,3}$ is modular, hence conjecture 3.4 implies that $E$ is modular.
If $\bar\rho_{E,3}$ is reducible, then Wiles argues (see [23]) that
$\bar\rho_{E,5}$ is isomorphic to $\bar\rho_{E',5}$ for some semistable $E'$ with
irreducible $\bar\rho_{E',3}$. Since $E'$ is modular, so is
$\bar\rho_{E',5} \cong \bar\rho_{E,5}$, so we may apply the preceding theorem with
$\ell = 5$.


4. REDUCTION TO THE CASE $\Sigma = \emptyset$


4.1. Commutative algebra. Suppose now that $f = \sum a_n q^n$ is a newform
in $\Phi_\emptyset$ (recall from [12] that such an $f$ exists), hence in
$\Phi_\Sigma$ for every finite set of primes $\Sigma$. Consider the commutative
triangle

$$\begin{array}{ccc} R_\Sigma & \xrightarrow{\ \phi_\Sigma\ } & \mathbf{T}_\Sigma \\ & \searrow & \downarrow \\ & & A_f \end{array}$$

where the downwards arrow is $\pi_{f,\Sigma}$ and the diagonal one corresponds to
$\rho_f$ via the universal property of $R_\Sigma$.

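In order to apply the commutative algebra results explained in [5], it
will be convenient to work with $\mathcal{O}$-algebras where
$\mathcal{O} = \mathcal{O}_{f,\lambda}$. Note that $\phi_\Sigma$ is an isomorphism
if and only if

$$\phi_\Sigma \otimes_{W(k)} \mathcal{O} : R_\Sigma \otimes_{W(k)} \mathcal{O} \to \mathbf{T}_\Sigma \otimes_{W(k)} \mathcal{O}$$

is an isomorphism. From now on, we replace $\phi_\Sigma$, $R_\Sigma$ and
$\mathbf{T}_\Sigma$ by their tensor products over $W(k)$ with $\mathcal{O}$. The
resulting representation $G_\mathbf{Q} \to \mathrm{GL}_2(R_\Sigma)$ is universal for
type-$\Sigma$ deformations of $\bar\rho$ as in (1), where now $A$ is a complete
local Noetherian $\mathcal{O}$-algebra with residue field $\mathcal{O}/\lambda$.
Writing $\mathcal{O}_g$ for the $\mathcal{O}$-subalgebra of $\bar{\mathbf{Q}}_\ell$
generated by the Fourier coefficients of $g$, we may identify $\mathbf{T}_\Sigma$
with the $\mathcal{O}$-subalgebra of

$$\prod_{g \in \Phi_\Sigma} \mathcal{O}_g$$

generated by the operators $T_p$ for $p$ not in $\Sigma \cup \{\ell\}$. Our
commutative triangle becomes

$$\begin{array}{ccc} R_\Sigma & \xrightarrow{\ \phi_\Sigma\ } & \mathbf{T}_\Sigma \\ & \searrow \ \ \swarrow & \\ & \mathcal{O} & \end{array}$$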


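Write $\pi_\Sigma$ for the map $\mathbf{T}_\Sigma \to \mathcal{O}$, let
$\mathfrak{p}_\Sigma$ denote the kernel of $R_\Sigma \to \mathcal{O}$, and let
$\eta_\Sigma$ denote the ideal
$\pi_\Sigma(\mathrm{Ann}_{\mathbf{T}_\Sigma} \ker \pi_\Sigma)$ (which is non-zero).
According to Criterion I of [5], we have

(6) $\mathrm{length}_\mathcal{O}(\mathfrak{p}_\Sigma/\mathfrak{p}_\Sigma^2) \geq \mathrm{length}_\mathcal{O}(\mathcal{O}/\eta_\Sigma)$

and the following are equivalent:

• $\phi_\Sigma$ is an isomorphism between complete intersections over $\mathcal{O}$.
• Equality holds in (6).


It is actually the following strengthening of conjecture 3.4 whose proof
we reduce to the case of $\Sigma = \emptyset$.


Conjecture 4.1. The surjection $\phi_\Sigma$ is an isomorphism between complete
intersections over $\mathcal{O}$.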


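In the remaining sections we sketch the proof that if

$$\mathrm{length}_\mathcal{O}(\mathfrak{p}_\Sigma/\mathfrak{p}_\Sigma^2) = \mathrm{length}_\mathcal{O}(\mathcal{O}/\eta_\Sigma),$$

then

$$\mathrm{length}_\mathcal{O}(\mathfrak{p}_{\Sigma'}/\mathfrak{p}_{\Sigma'}^2) \leq \mathrm{length}_\mathcal{O}(\mathcal{O}/\eta_{\Sigma'})$$

where $\Sigma' = \Sigma \cup \{p\}$. This implies:


Theorem 4.2. If conjecture 4.1 holds in the case $\Sigma = \emptyset$, then it
holds for all $\Sigma$.


The proof of conjecture 4.1 in the case $\Sigma = \emptyset$ is explained in [3].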


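4.2. Selmer groups. Recall from Chapter VI of [19] (see also §2.7 of [2])
that the $\mathcal{O}$-module $\mathfrak{p}_\Sigma/\mathfrak{p}_\Sigma^2$ has a
natural description in terms of Galois cohomology. Write $M_f$ for $\mathcal{O}^2$
with Galois action defined by $\rho_f$ and $E_f$ for $\mathrm{ad}^0 M_f$, the
$\mathcal{O}[G_\mathbf{Q}]$-module of trace-zero $\mathcal{O}$-endomorphisms of
$M_f$. Let $E_{f,n}$ denote $E_f \otimes_\mathcal{O} \lambda^{-n}\mathcal{O}/\mathcal{O}$
and set

$$E_{f,\infty} = \varinjlim E_{f,n} = E_f \otimes_\mathcal{O} K/\mathcal{O} \cong E_f \otimes_{\mathbf{Z}_\ell} \mathbf{Q}_\ell/\mathbf{Z}_\ell.$$

According to §28 of [19] (or more precisely its analogue in the case of fixed
determinant; cf. Proposition 3 of §26), we have a canonical isomorphism

$$\mathrm{Hom}_\mathcal{O}(\mathfrak{p}_\Sigma/\mathfrak{p}_\Sigma^2, K/\mathcal{O}) \cong H^1_\mathcal{D}(G_{\mathbf{Q},\Sigma \cup \{\ell\}}, E_{f,\infty})$$

where $\mathcal{D}$ is the type-$\Sigma$ deformation condition. Appealing to the
descriptions in §§29-31 of the resulting conditions on local cohomology classes,
we find that the latter $\mathcal{O}$-module is the “generalized Selmer group”

$$\varinjlim H^1_\mathcal{L}(\mathbf{Q}, E_{f,n})$$

in the notation of §6 of [25] (with the appropriate choice of local condition
at $\ell$ from §7).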

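The analogous statements hold with $\Sigma$ replaced by $\Sigma'$ (in which case
we write $\mathcal{D}'$ and $\mathcal{L}'$), so to compare the lengths of
$\mathfrak{p}_\Sigma/\mathfrak{p}_\Sigma^2$ and
$\mathfrak{p}_{\Sigma'}/\mathfrak{p}_{\Sigma'}^2$, we consider the cokernel of the
natural inclusion

$$H^1_\mathcal{D}(G_{\mathbf{Q},\Sigma \cup \{\ell\}}, E_{f,\infty}) \to H^1_{\mathcal{D}'}(G_{\mathbf{Q},\Sigma' \cup \{\ell\}}, E_{f,\infty}).$$

Suppose now that $p$ does not divide $\ell N(\bar\rho)$. From the definitions of the
generalized Selmer groups, we see that our cokernel embeds naturally in

$$H^1(I_p, E_{f,\infty})^{G_p/I_p}.$$

Since the action of $I_p$ on $E_{f,\infty}$ is trivial, the cohomology group can be
identified with

$$\mathrm{Hom}(I_p, E_{f,\infty}) = \mathrm{Hom}(\mathbf{Z}_\ell(1), E_{f,\infty})$$

as a module for $G_p/I_p$. We are therefore reduced to computing the length
of

$$H^0(G_p/I_p, E_{f,\infty}(-1)).$$

This is just the kernel of the endomorphism $1 - \mathrm{Frob}_p^{-1}$ of
$E_f(-1) \otimes_\mathcal{O} K/\mathcal{O}$.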
This kernel is finite if and only if $1 - \mathrm{Frob}_p^{-1}$ defines an
automorphism of the $K$-vector space $E_f(-1) \otimes_\mathcal{O} K$, in which case
its $\mathcal{O}$-length is simply the valuation of the determinant. We compute this
determinant using the fact that the characteristic polynomial of $\mathrm{Frob}_p$
on $M_f \otimes_\mathcal{O} K$ is $X^2 - a_p X + p$ (see (5)). The result is that
our determinant is

(7) $(1 - p)\left((1 + p)^2 - a_p^2\right)$

(non-zero by [22], theorem 5), so we conclude that

(8) $\mathrm{length}_\mathcal{O}(\mathfrak{p}_{\Sigma'}/\mathfrak{p}_{\Sigma'}^2) \leq \mathrm{length}_\mathcal{O}(\mathfrak{p}_\Sigma/\mathfrak{p}_\Sigma^2) + v_\lambda(c_p)$

where $c_p$ is given by (7).
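
The determinant (7) can be sanity-checked by exact linear algebra. The sketch below is only an illustration under stated assumptions: it models Frobenius on $M_f \otimes K$ by the companion matrix of $X^2 - a_p X + p$ (any matrix with that characteristic polynomial would do), lets it act on trace-zero matrices by conjugation, and uses the convention that $\mathrm{Frob}_p^{-1}$ acts on the twist $(-1)$ by $p$. The function names are hypothetical.

```python
from fractions import Fraction

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def twisted_adjoint_determinant(a_p, p):
    """det(1 - p * Ad(F)^(-1)) on trace-zero 2x2 matrices, where F is the
    companion matrix of X^2 - a_p X + p (an assumed model of Frob_p)."""
    F = [[Fraction(0), Fraction(-p)], [Fraction(1), Fraction(a_p)]]
    F_inv = [[Fraction(a_p, p), Fraction(1)], [Fraction(-1, p), Fraction(0)]]
    # Basis of trace-zero matrices: e, h, f.
    basis = [[[0, 1], [0, 0]], [[1, 0], [0, -1]], [[0, 0], [1, 0]]]
    cols = []
    for X in basis:
        Y = matmul(matmul(F_inv, X), F)        # conjugation by F^(-1)
        # Coordinates of p*Y in the basis (e, h, f).
        cols.append([p * Y[0][1], p * Y[0][0], p * Y[1][0]])
    M = [[(1 if i == j else 0) - cols[j][i] for j in range(3)]
         for i in range(3)]
    return det3(M)

def c_p(a_p, p):
    """The closed form (7)."""
    return (1 - p) * ((1 + p) ** 2 - a_p ** 2)

print(twisted_adjoint_determinant(-2, 7), c_p(-2, 7))
```

The two printed values agree for the sample pair $a_p = -2$, $p = 7$; the same comparison can be run for other integer pairs, since the identity is polynomial in $a_p$ and $p$.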
The cohomological calculation above is a special case of the following
general result, whose proof is left as an exercise.


Proposition 4.3. Suppose that $p \neq \ell$ and $X$ is a finitely generated free
$\mathcal{O}$-module with a continuous action of $G_p$. Let
$X_\infty = X \otimes_\mathcal{O} K/\mathcal{O}$. Then the Fitting ideal of the
$\mathcal{O}$-module $H^1(I_p, X_\infty)^{G_p/I_p}$ is generated by the determinant
of the endomorphism $1 - \mathrm{Frob}_p^{-1} p$ of $(X \otimes_\mathcal{O} K)_{I_p}$.


Applying it in the case that $p \neq \ell$, but $p$ divides $N(\bar\rho)$, we find
that the space $(E_f \otimes_\mathcal{O} K)_{I_p}$ is one-dimensional,
$\mathrm{Frob}_p$ acts by the inverse of the cyclotomic character, and (8) holds
with $c_p = 1 - p^2$.

Finally, in the case $p = \ell$, the groups $H^1_\mathcal{D}$ and
$H^1_{\mathcal{D}'}$ coincide unless $\bar\rho$ is ordinary and finite flat. In that
case, bounding the cokernel is more subtle and one uses the calculations in [25] to
show that the length increases by at most the valuation of $1 - \alpha_\ell^2$,
where $\alpha_\ell$ is the unit root of $X^2 - a_\ell X + \ell = 0$.
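
To make the unit root concrete, here is a small sketch that finds $\alpha_\ell$ modulo a power of $\ell$ by brute force and reads off the valuation of $1 - \alpha_\ell^2$. The sample values $\ell = 3$, $a_\ell = -1$ are an assumption chosen for illustration (they are the values for the level-11 curve in the example of section 4.3), and the helper names are hypothetical. Writing $a_\ell = \alpha_\ell + \ell/\alpha_\ell$, one finds $(1+\ell)^2 - a_\ell^2 = (1 - \alpha_\ell^2)(1 - \ell^2/\alpha_\ell^2)$, so this valuation should match that of the expression (7) evaluated at $p = \ell$; the code compares the two.

```python
def unit_root(a_l, l, k):
    """Unit root of X^2 - a_l*X + l modulo l**k, found by brute force.
    Assumes a_l is an l-adic unit, so Hensel's lemma gives a unique unit root."""
    modulus = l ** k
    roots = [r for r in range(modulus)
             if r % l != 0 and (r * r - a_l * r + l) % modulus == 0]
    assert len(roots) == 1, "expected exactly one unit root"
    return roots[0]

def valuation(n, l, cap):
    """l-adic valuation of n computed modulo l**cap (returns cap if n = 0 mod l**cap)."""
    n %= l ** cap
    v = 0
    while v < cap and n % l == 0:
        n //= l
        v += 1
    return v

l, a_l, k = 3, -1, 4                 # illustrative values, see the lead-in
alpha = unit_root(a_l, l, k)
v_unit = valuation(1 - alpha * alpha, l, k)
v_formula = valuation((1 - l) * ((1 + l) ** 2 - a_l ** 2), l, k)
print(alpha, v_unit, v_formula)
```

The precision `k` only needs to exceed the valuation being measured; a result equal to `k` would mean the precision was too low to decide.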

Summing up, we have 


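Lemma 4.4. In all of the cases above,

$$\mathrm{length}_\mathcal{O}(\mathfrak{p}_{\Sigma'}/\mathfrak{p}_{\Sigma'}^2) \leq \mathrm{length}_\mathcal{O}(\mathfrak{p}_\Sigma/\mathfrak{p}_\Sigma^2) + v_\lambda(c_p)$$

where

$$c_p = \begin{cases} (1 - p)\left((1 + p)^2 - a_p^2\right) & \text{if } p \nmid N(\bar\rho), \\ 1 - p^2 & \text{otherwise.} \end{cases}$$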


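4.3. Congruence modules. Now we have to prove the inequality

$$\mathrm{length}_\mathcal{O}(\mathcal{O}/\eta_{\Sigma'}) \geq \mathrm{length}_\mathcal{O}(\mathcal{O}/\eta_\Sigma) + v_\lambda(c_p),$$

or equivalently:

(9) $\eta_{\Sigma'} \subset c_p \eta_\Sigma$,

where $c_p$ is defined above. Before sketching the proof, we describe the
general strategy and consider an example.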

The first observation to make is that the ideal $\eta_\Sigma$ measures congruences
between $f$ and other forms of level $N_\Sigma$. Suppose for simplicity that
$\mathcal{O}$ contains the coefficients of all newforms in $\Phi_\Sigma$. Then
$\mathbf{T}_\Sigma$ is contained in a product of copies of $\mathcal{O}$ indexed by
these newforms and $\pi_\Sigma$ is the projection onto the coordinate corresponding
to $f$. The ideal $\eta_\Sigma$ consists of those $x$ such that

$$(x, 0, 0, \ldots, 0) \in \mathbf{T}_\Sigma,$$

where the first coordinate corresponds to $f$. If there are just two forms,
$f$ and $g$, in $\Phi_\Sigma$, then $\eta_\Sigma$ is the ideal generated by
$a_p(f) - a_p(g)$ for $p$ not dividing $N_\Sigma$. So $\eta_\Sigma$ measures how
congruent the coefficients of $f$ and $g$ are at “good” primes. More generally, one
finds that $\eta_\Sigma \subset \lambda^n$ if $f \equiv g \bmod \lambda^n$ (in the
sense above) for some $g$ which is a linear combination over $\mathcal{O}$ of the
forms different from $f$.

Consider for example the unique newform $f$ of weight 2 and level 11.
Its Fourier coefficients are rational, and the associated $L$-function is that
of (the isogeny class of) the elliptic curve $E$ defined by

$$Y^2 + Y = X^3 - X^2.$$

The 3-adic representation attached to $f$ is equivalent to

$$\rho_{E,3} : G_\mathbf{Q} \to \mathrm{Aut}(T_3(E)) \cong \mathrm{GL}_2(\mathbf{Z}_3),$$

and we let $\bar\rho$ denote the reduction. (We leave it to the reader to verify
that $\bar\rho$ satisfies our running hypotheses.) Since $\Phi_\emptyset$ is the
singleton set $\{f\}$ (as $N_\emptyset = 11$), we have $\eta_\emptyset = \mathcal{O}$.
Take $\Sigma = \{3\}$. Then $N_\Sigma = 33$, and $\Phi_\Sigma$ consists of $f$ and
the unique newform $g$ of (weight 2, trivial character and) level 33. To prove this
statement, one needs to check that $f$ and $g$ are congruent mod 3; for this, it
suffices to compare the Galois action on the 3-division points of the corresponding
elliptic curves. Therefore, $\eta_\Sigma = 3\mathcal{O}$ is indeed generated by $c_3$.

Now consider the problem of comparing $\eta_\Sigma$ and $\eta_{\Sigma'}$, where
$\Sigma' = \Sigma \cup \{p\}$, when $p = 7$. Then $N_{\Sigma'} = 1617$ and $c_p$ is
9 times a unit. So we could verify this case of the desired inclusion by finding
congruent newforms among the levels 77, 231, 539 and 1617, and then writing down a
linear combination of $g$ with these forms which is congruent to $f$ mod
$27\mathcal{O}$.
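
The numbers in this example are easy to reproduce. The sketch below assumes that the level-11 curve is given by the Weierstrass equation $y^2 + y = x^3 - x^2$, counts points by brute force to obtain $a_p = p + 1 - \#E(\mathbf{F}_p)$, and evaluates $c_p = (1-p)((1+p)^2 - a_p^2)$ from Lemma 4.4 at $p = 3$ and $p = 7$ (neither divides 11). The function names are hypothetical.

```python
def a_p(p):
    """a_p = p + 1 - #E(F_p) for E: y^2 + y = x^3 - x^2 (assumed equation),
    with #E(F_p) obtained by counting affine solutions plus the point at infinity."""
    affine = sum(1 for x in range(p) for y in range(p)
                 if (y * y + y - x ** 3 + x * x) % p == 0)
    return p + 1 - (affine + 1)

def c_p(p):
    """The quantity c_p of Lemma 4.4 for a prime p not dividing the conductor 11."""
    a = a_p(p)
    return (1 - p) * ((1 + p) ** 2 - a * a)

def ord3(n):
    """3-adic valuation of a non-zero integer."""
    n, v = abs(n), 0
    while n % 3 == 0:
        n //= 3
        v += 1
    return v

for p in (3, 7):
    print(p, a_p(p), c_p(p), ord3(c_p(p)))

# Levels appearing in the example: 7*11, 3*7*11, 7^2*11, 3*7^2*11.
print([7 * 11, 3 * 7 * 11, 7 ** 2 * 11, 3 * 7 ** 2 * 11])
```

Under these assumptions $c_3$ has 3-adic valuation 1 and $c_7$ has valuation 2, matching $\eta_\Sigma = 3\mathcal{O}$ and the statement that $c_7$ is 9 times a unit.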

This type of problem (“raising the level”) was first addressed in [20],
where the general strategy is as follows: Rather than produce such congruences
directly, one detects them using the cohomology of modular curves$^2$,
or in our case, their Jacobians. The problem of comparing these “cohomological
congruences” at different levels is then reduced to studying the
kernel of a certain homomorphism of Jacobians induced by degeneracy
maps on the curves. This last issue is then resolved by a result of Ihara.

The method and results of [20] were sharpened and generalized in various
ways in such articles as [6], [11], [26] and [7]. We now sketch Wiles’s proof
([26], section 2.2) of (9), with some modifications taken from [2].


$^2$This approach is suggested by work of Hida [17], which also establishes a relation
between congruences and certain values of $L$-functions; see also section 4.4 of [2] and [9].
Let $\mathfrak{m}$ denote the maximal ideal of $\mathbf{T} \otimes \mathcal{O}$
containing the kernel of the homomorphism to $\mathcal{O}$ determined by $f_\Sigma$.
Define $M_\Sigma$ as the localization at $\mathfrak{m}$ of
$T_\ell(J_0(N_\Sigma)) \otimes_{\mathbf{Z}_\ell} \mathcal{O}$. One deduces from
lemma 3.2 that the $\mathcal{O}$-algebra $\mathbf{T}_\Sigma$ is isomorphic to
$(\mathbf{T} \otimes \mathcal{O})_\mathfrak{m}$, so we may regard $M_\Sigma$ as a
module for $\mathbf{T}_\Sigma$, hence $R_\Sigma$. Recall from [24] that Wiles proves
the following generalization of results of Mazur and others:


Theorem 4.5. The $\mathbf{T}_\Sigma$-module $M_\Sigma$ is free of rank two.


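The module $M_\Sigma$ is equipped with an alternating pairing
$\langle\ ,\ \rangle_\Sigma$ that induces an isomorphism

$$M_\Sigma \cong \mathrm{Hom}_\mathcal{O}(M_\Sigma, \mathcal{O})$$

of $\mathbf{T}_\Sigma$-modules. Since $M_\Sigma$ is free of rank two over
$\mathbf{T}_\Sigma$, the submodule $M_\Sigma[\mathfrak{p}_\Sigma]$ (the set of
elements annihilated by every $t$ in $\mathfrak{p}_\Sigma$) is free of rank two over
$\mathbf{T}_\Sigma/\mathfrak{p}_\Sigma \cong \mathcal{O}$. On combining these facts,
one shows easily that if $\{x, y\}$ is a basis for $M_\Sigma[\mathfrak{p}_\Sigma]$,
then

$$\eta_\Sigma = (\langle x, y \rangle_\Sigma).$$

(See [2] for the details.)
To compare $\eta_\Sigma$ and $\eta_{\Sigma'}$, one defines a
$\mathbf{T}_{\Sigma'}$-equivariant map

$$M_{\Sigma'} \to M_\Sigma.$$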


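Its definition employs the degeneracy maps $X_0(N_{\Sigma'}) \to X_0(N_\Sigma)$
induced by $\tau \mapsto p^i\tau$ for $i \leq m_p$. These induce by Albanese
functoriality maps $J_0(N_{\Sigma'}) \to J_0(N_\Sigma)$, hence maps

$$T_\ell(J_0(N_{\Sigma'})) \otimes_{\mathbf{Z}_\ell} \mathcal{O} \to T_\ell(J_0(N_\Sigma)) \otimes_{\mathbf{Z}_\ell} \mathcal{O} \to M_\Sigma$$

which we denote $\delta_i$. A suitable $\mathbf{T}_\Sigma$-linear combination of
these, namely

• $\delta_0 - p^{-1}T_p\delta_1 + p^{-1}\delta_2$ if $p \neq \ell$ and $m_p = 2$;

• $\delta_0 - p^{-1}T_p\delta_1$ if $p \neq \ell$ and $m_p = 1$;

• $\delta_0 - \tau_\ell^{-1}\delta_1$ if $p = \ell$ and $m_\ell = 1$, where
$\tau_\ell$ is the unit root in $\mathbf{T}_\Sigma$ of $X^2 - T_\ell X + \ell = 0$;

• $\delta_0$ if $p = \ell$ and $m_\ell = 0$;

induces the desired homomorphism

$$\beta : M_{\Sigma'} \to M_\Sigma$$

of $\mathbf{T}_{\Sigma'}$-modules (see p. 119 of [2]).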

Write $\beta'$ for the adjoint of $\beta$ with respect to the pairings
$\langle\ ,\ \rangle_\Sigma$ and $\langle\ ,\ \rangle_{\Sigma'}$. A straightforward
computation shows that the composite $\beta\beta'$ is an endomorphism in
$\mathbf{T}_\Sigma$ which is a unit times

• $(1 - p)\left((1 + p)^2 - T_p^2\right)$ if $p \nmid N(\bar\rho)$,
• $1 - p^2$ otherwise

(see p. 121 of [2]). Note that this operator is $c_p$ mod $\mathfrak{p}_\Sigma$.
(Moreover this holds with $\mathfrak{p}_\Sigma$ replaced by any minimal prime of
$\mathbf{T}_\Sigma$ and $c_p$ replaced by its analogue defined using the
corresponding newform. Since these are non-zero, $\beta\beta'$, and hence $\beta'$,
is injective.) To obtain (9), it suffices to prove that $\beta'$ has torsion-free
cokernel, for then a basis $\{x, y\}$ for $M_\Sigma[\mathfrak{p}_\Sigma]$ yields a
basis $\{\beta'(x), \beta'(y)\}$ for $M_{\Sigma'}[\mathfrak{p}_{\Sigma'}]$ and we
conclude that

$$\eta_{\Sigma'} = (\langle \beta'(x), \beta'(y) \rangle_{\Sigma'}) = c_p(\langle x, y \rangle_\Sigma) = c_p\eta_\Sigma.$$
Since $\beta'$ is injective, it has torsion-free cokernel if and only if $\beta$ is
surjective (or equivalently, $\beta'$ mod $\lambda$ is injective). Recall that
$\beta$ is defined using the maps on homology (or Tate modules) induced by the
degeneracy maps $X_0(N_{\Sigma'}) \to X_0(N_\Sigma)$. We therefore wish to analyze
the cokernel of the homomorphism

$$H_1(X_0(N_{\Sigma'}), \mathcal{O}) \to H_1(X_0(N_\Sigma), \mathcal{O})^{m_p+1}$$

gotten from the degeneracy maps. This map is not surjective in general,
but it is enough to prove:


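Lemma 4.6. Suppose that $N_\Sigma > 3p$ if $p$ divides $N_\Sigma$. Then the map

$$H_1(X_0(N_{\Sigma'}), \mathbf{Z}_\ell)_{\mathfrak{m}'} \to H_1(X_0(N_\Sigma), \mathbf{Z}_\ell)^{m_p+1}_{\mathfrak{m}'}$$

is surjective, where $\mathfrak{m}'$ is the intersection of $\mathfrak{m}$ with the
subalgebra of $\mathbf{T} \otimes \mathbf{Z}_\ell$ generated by the operators $T_r$
for primes $r \neq p$.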


We sketch the proof of the lemma in the generic (and most difficult) case
of $m_p = 2$ and then explain what changes are needed to treat the remaining
cases.

First recall that a similar problem is solved in [20]. Let $X_1(N_\Sigma, p)$ be
the modular curve associated to $\Gamma_1(N_\Sigma) \cap \Gamma_0(p)$ and consider
the map

(10) $(\pi_{1,*}, \pi_{2,*}) : H_1(X_1(N_\Sigma, p), \mathbf{Z}_\ell) \to H_1(X_1(N_\Sigma), \mathbf{Z}_\ell)^2$,

where here and below, $\pi_{1,*}$ (resp. $\pi_{2,*}$) will denote the map induced by
$\tau \mapsto \tau$ (resp. $\tau \mapsto p\tau$). In section 4 of [20], the
surjectivity of this map is proved by a group-theoretic argument using results of
Ihara. The surjectivity of (10) is a key ingredient in the proof of the lemma.

Next consider the sequence of homology groups of non-compact modular
curves

(11) $H_1(Y_1(N_\Sigma p, p^2), \mathbf{Z}_\ell) \to H_1(Y_1(N_\Sigma p), \mathbf{Z}_\ell)^2 \to H_1(Y_1(N_\Sigma), \mathbf{Z}_\ell)$,
$\quad x \mapsto (\pi_{1,*}x, \pi_{2,*}x); \quad (y, z) \mapsto \pi_{2,*}y - \pi_{1,*}z.$

Wiles proves the exactness of this sequence by an elementary group-theoretic
argument (see lemma 2.5 of [26]). If we could replace $X_1$ and $Y_1$ with
$X_0$ in (10) and (11), we could now deduce that

$$H_1(X_0(N_\Sigma p^2), \mathbf{Z}_\ell) \to H_1(X_0(N_\Sigma), \mathbf{Z}_\ell)^3$$

is surjective. A minor complication arises when we try to make this change,
and one finds instead that the cokernel is supported only at “Eisenstein”
maximal ideals of the Hecke algebra. (See the last half of section 4.5 of [2]
for more details.) The irreducibility hypothesis on $\bar\rho$ ensures that
$\mathfrak{m}'$ is not Eisenstein, from which we deduce the lemma.

Suppose now that $m_p = 1$. If $p \neq \ell$ (in which case $p$ divides
$N_\Sigma$), then one uses the exactness of (11) with $N_\Sigma$ replaced by
$N_\Sigma/p$. If $p = \ell$, then one just uses the surjectivity of (10). In either
case, the lemma follows as above since $\mathfrak{m}'$ is not Eisenstein, using also
that $\mathfrak{m}'$ is not in the support of
$H_1(X_0(N_\Sigma/p), \mathbf{Z}_\ell)$ in the case $p \neq \ell$. There is nothing
to prove if $m_p = 0$.


5. EPILOGUE 


Some parts of the exposition above, especially the last section, draw
from [2]. As explained there, theorem 4.5 is actually only used in the case
$\Sigma = \emptyset$.

In a recent article [10], the first author has presented a modification of 
the method of Taylor-Wiles-Faltings which makes no appeal to lemma 3.2 
and theorem 4.5.° Instead, in the modified approach, one deduces these 
two results as by-products of the proof of the Main Conjecture. The new 
idea is to use tools from commutative algebra to prove directly, without 
any initial reference to the Hecke algebra $\mathbf{T}_\emptyset$, that $M_\emptyset$ is free over $R_\emptyset$.
THE FLAT DEFORMATION FUNCTOR


BRIAN CONRAD 


Introduction 


Let $E_{/\mathbf{Q}}$ be a semistable elliptic curve and $p$ a prime. In the proof
that $E$ is modular, properties of the local representation

$$\rho_{E,p}|_{D_p} : D_p = \mathrm{Gal}(\bar{\mathbf{Q}}_p/\mathbf{Q}_p) \to \mathrm{Aut}(\mathrm{Ta}_p(E(\bar{\mathbf{Q}}_p))) \simeq \mathrm{GL}_2(\mathbf{Z}_p)$$

play an essential role. If $E$ has either ordinary or multiplicative reduction
at $p$, then one can describe the essential properties of $\rho_{E,p}|_{D_p}$ quite
explicitly in representation-theoretic terms. However, if $E$ has supersingular
reduction at $p$, then the situation is more subtle. More precisely, the residual
representation $\bar\rho_{E,p}|_{D_p} = \rho_{E,p}|_{D_p} \bmod p$ is absolutely
irreducible (which already distinguishes this case from the other two cases) and
for all $n \geq 1$, the finite discrete $D_p$-module $\rho_{E,p}|_{D_p} \bmod p^n$
has the property that the corresponding $\mathbf{Q}_p$-group scheme arises as the
generic fiber of a finite flat $\mathbf{Z}_p$-group scheme (necessarily commutative,
with order $p^{2n}$).

In order to study the supersingular case, Wiles considers a ‘flat’ deformation
problem which makes critical use of the theory of finite flat group
schemes. The representability of the deformation functor in this case, as
well as the (abstract) ‘computation’ of the corresponding representing ring
(when $p \neq 2$), were worked out by Ramakrishna in his thesis [27]. The
central tool he uses is Fontaine’s work on finite flat group schemes [13]
(made more ‘explicit’ by the work of Fontaine-Laffaille [14, §9]). For $p \neq 2$,
Fontaine constructed an equivalence of categories between finite flat commutative
$\mathbf{Z}_p$-group schemes with $p$-power order and a category consisting
of finite-length $\mathbf{Z}_p$-modules with certain extra structures. This allows
one to reformulate the study of certain group schemes over $\mathbf{Z}_p$ as the
study of certain modules. Since it is far easier to manipulate and construct modules
than it is to manipulate and construct group schemes, this reformulation
is very useful.

We begin with an explanation of why ‘flat’ representations are a natural 
thing to consider and outline the role they play in Wiles’ proof. After 
relating flat representations to local Galois cohomology, we formulate and 
partially prove the main results (due to Ramakrishna). We then review 
certain fundamental results due to Fontaine [13]. With these in our toolbox, 
we complete the proof of Ramakrishna’s theorem concerning the ‘explicit’
structure of the deformation ring which represents the ‘flat deformation 
functor.’ This yields a theorem about Galois cohomology; we also prove 
a similar cohomological result in the residually split ordinary case (where 
the deformation functor is not known to be representable). Throughout, 
we assume a familiarity with the basic notions in the deformation theory 
of Galois representations, as given in [23], and the theory of finite flat 
commutative group schemes, as given in [34]. Our discussion provides a 
‘concrete’ application of these ideas. 

Much of what we say over $\mathbf{Z}_p$, $\mathbf{Q}_p$, and $\mathbf{F}_p$ is valid
somewhat more generally, but we will stick with the more concrete setting when it is
convenient to do so. Also, though we will frequently use Néron models of
elliptic curves, we do not use anything from the theory of Néron models;
all we need is that an integral Weierstrass model in $\mathbf{P}^2$ with a smooth
closed fiber (and therefore a smooth generic fiber) admits the structure
of a group scheme extending the canonical group scheme structure on the
generic fiber. This can be proven directly [32, Ch IV, Remark 5.4.1].


§0. NOTATION 

We fix algebraic closures $\bar{\mathbf{Q}}$ and $\bar{\mathbf{Q}}_\ell$ (for all
places $\ell$ of $\mathbf{Q}$), as well as embeddings
$\iota_\ell : \bar{\mathbf{Q}} \hookrightarrow \bar{\mathbf{Q}}_\ell$. For any field
$K$ with a fixed separable closure $K_s$, let $G_K = \mathrm{Gal}(K_s/K)$. Let
$D_\ell \subset G_\mathbf{Q}$ denote the decomposition group at $\ell$ arising from
$\iota_\ell$. When $\ell \neq \infty$, let $I_\ell \subset D_\ell$ denote the inertia
subgroup. All finite extensions of $\mathbf{Q}_\ell$ are understood to lie inside of
$\bar{\mathbf{Q}}_\ell$ and for $\ell \neq \infty$, all finite fields of
characteristic $\ell$ are understood to be subfields of the residue field
$\bar{\mathbf{F}}_\ell$ naturally attached to $\bar{\mathbf{Q}}_\ell$. These choices
are not important, but making them at the start simplifies the exposition below.

For $\Sigma$ a finite set of places of $\mathbf{Q}$, let
$\mathbf{Q}_\Sigma \subset \bar{\mathbf{Q}}$ denote the maximal subextension
unramified outside of $\Sigma$, with
$G_\Sigma = \mathrm{Gal}(\mathbf{Q}_\Sigma/\mathbf{Q})$ the corresponding quotient of
$G_\mathbf{Q}$. Finally, when a prime $p$ is fixed under discussion, let
$\varepsilon : D_p \to \mathbf{Z}_p^\times$ denote the $p$-adic cyclotomic character
and let

$$\omega = \varepsilon \bmod p : D_p \to \mathbf{F}_p^\times$$

denote its reduction modulo $p$. We allow for the possibility $p = 2$ unless we
explicitly say otherwise. Finally, we adopt a common abuse of terminology:
when we say ‘finite flat $S$-group scheme,’ it is always understood that we
mean ‘finite flat commutative $S$-group scheme’ (of finite presentation over
$S$, but our bases are always noetherian). We sometimes omit reference to
$S$ when the context makes it clear. When $S = \mathrm{Spec}(L)$ with $L$ a field,
we will abbreviate this to ‘finite $L$-group scheme.’ The notation $X_{/S}$ denotes
an $S$-scheme $X$. If $S = \mathrm{Spec}(A)$, then this is usually written $X_{/A}$
instead. Also, for certain standard group schemes $G$ over $\mathbf{Z}$ or
$\mathbf{F}_p$, such as $\mathbf{G}_m$, $\alpha_p$, and $\mu_p$, we write $G_T$ to
denote the base extension to $T$, and we sometimes omit reference to $T$ when the
context makes it clear.


§1. MOTIVATION AND FLAT REPRESENTATIONS 

Fix a semistable elliptic curve $E_{/\mathbf{Q}}$ and a prime $p$. Suppose $p$ is
odd, the restriction of $\bar\rho_{E,p}$ to
$\mathrm{Gal}(\bar{\mathbf{Q}}/\mathbf{Q}(\sqrt{(-1)^{(p-1)/2}p}))$ is absolutely
irreducible, and $\bar\rho_{E,p}$ is modular (e.g., $p = 3$ and
$\bar\rho_{E,3}|_{\mathrm{Gal}(\bar{\mathbf{Q}}/\mathbf{Q}(\sqrt{-3}))}$ is absolutely
irreducible [38, pp. 541-542]). Wiles’ proof provides an inductive procedure to show
that for successively larger and larger $n$, the representations
$\rho_{E,p} \bmod p^n$ are ‘modular’ (in an appropriate sense). Moreover,
sufficiently tight control is kept on the level (as well as other properties) of the
modular forms which are constructed so that one can ‘pass to the limit’ and conclude
that $\rho_{E,p}$ is ‘modular.’ In order to carry out this procedure, there is an
extremely delicate balancing act to handle, with (abstract) deformation
rings on one side and (concrete) Hecke rings on the other side. The latter
provide a link to modular forms and representations ‘coming from modular
forms,’ whereas the former provide a link to the particular representation
of interest, $\rho_{E,p}$, which we want to prove ‘comes from a modular form.’
The relation between the two different types of rings, leading to the
proof that they’re isomorphic, is supplied by a numerical criterion from
commutative algebra [9]. The hard part is to check that this numerical
criterion actually can be applied! In order to do this, one has to prove
highly non-obvious theorems about the commutative algebra properties of
the rings in question. This requires a very detailed understanding of both
the deformation rings and the Hecke rings.

Refined knowledge about Hecke rings can be obtained via methods of
algebraic geometry [36], essentially because Hecke rings act on algebro-geometric
objects such as Jacobians of modular curves. However, refined knowledge about
deformation rings is supplied by completely different
methods, namely techniques from Galois cohomology. The problem of estimating
the sizes of certain ‘global’ Galois cohomology groups is the central
issue, and is the means by which the ‘flat deformation functor’ enters into
Wiles’ proof. The cohomology groups whose order must be bounded are
of the form $H^1_\mathcal{D}(G_\Sigma, \mathrm{ad}(\rho))$, where
$\rho : G_\Sigma \to \mathrm{GL}_2(A)$ is a continuous representation, with $A$ a
suitable local artin ring having finite residue field $k$ of characteristic $p$.
Here, $\mathcal{D}$ is a well-chosen $G_\Sigma$-deformation problem for
$\bar\rho = \rho \bmod \mathfrak{m}_A$. In practice, $\bar\rho$ is essentially
$\bar\rho_{E,p}$, up to extension of scalars on the residue field $k$. The
deformation problem $\mathcal{D}$ imposes ‘local’ conditions on
$G_\Sigma$-deformations of $\bar\rho$ at the finite set of places $\Sigma$,
including $p$ and $\infty$. In the application to elliptic curves, the other places
in $\Sigma$ consist of the places where $E_{/\mathbf{Q}}$ has bad reduction (though
in his arguments, Wiles also needs to consider other sets $\Sigma'$ which include
auxiliary primes used in the study of the corresponding ‘minimal’ deformation
problem). The study of ‘type $\mathcal{D}$’ deformations of $\bar\rho$ translates
cohomologically into studying elements of $H^1(G_\Sigma, \mathrm{ad}(\rho))$ whose
local restrictions at places in $\Sigma$ satisfy certain conditions; these special
elements constitute the $A$-submodule $H^1_\mathcal{D}(G_\Sigma, \mathrm{ad}(\rho))$.

As is explained in [37] and [38, Prop 1.6], for any finite discrete
$G_\Sigma$-module $X$ with $p$-power order (e.g., $X = E[p^n](\bar{\mathbf{Q}})$),
the problem of estimating the size of $H^1_\mathcal{D}(G_\Sigma, X)$ reduces
essentially to the problem of estimating the size of certain $D_\ell$-cohomology
groups for the places $\ell \in \Sigma$. We say ‘essentially’ because there is
another ‘dual’ group $H^1_{\mathcal{D}^*}(G_\Sigma, X^*)$ that must be considered,
too. But at a critical stage in what Wiles calls the ‘minimal deformation problem,’
this extra factor is not hard to handle (this essentially follows from the
injectivity of $\varepsilon_q$ in [38, (3.1)]).

Among the sizes of the various $D_\ell$-cohomology groups to be estimated,
the only one which turns out to present serious difficulties is the case
$\ell = p$. Wiles considers two types of $D_p$-cohomology groups, both modeled
on semistability of $E$ at $p$ and based on an attempt to capture the
essential properties of $\rho_{E,p}|_{D_p}$ in terms of deformation theory and Galois
cohomology. Particularly in view of recent work of Diamond [10], these
local deformation-theoretic/Galois-cohomological conditions at $p$ are the
central reason that semistability conditions enter into Wiles’ proof. Any
attempt to prove the full Taniyama-Shimura Conjecture will almost certainly
have to first involve the formulation of local deformation conditions
at $p$ which accurately describe cases with additive reduction.

Before precisely defining the $D_p$-cohomology group and the deformation
functor that will concern us, we will first prove some preliminary results.
We now allow $p = 2$. In order to state the first result, we recall the notion
of a fundamental character (this will occur repeatedly below, albeit in a
limited context). The tame inertia group $I_p^t$ of $\mathbf{Q}_p$ admits a
canonical isomorphism

$$I_p^t \simeq \varprojlim \mathbf{F}_{p^n}^\times,$$

where the (surjective) transition maps are given by taking norms (recall our
convention about finite fields in the Notation section above). This follows
readily from the structure theorem for tamely ramified finite extensions of
local fields. See [30, §1] for more details and generality. The projection

$$\psi_n : I_p \twoheadrightarrow I_p^t \twoheadrightarrow \mathbf{F}_{p^n}^\times$$

is called the fundamental character of level $n$ (for $\mathbf{Q}_p$). Explicitly,
if $\pi \in \mathbf{Z}_p$ is a uniformizer and $\alpha_n \in \bar{\mathbf{Q}}_p$
satisfies $\alpha_n^{p^n - 1} = \pi$, then for $g \in I_p$,

$$\psi_n(g) = \frac{g(\alpha_n)}{\alpha_n}.$$
For example, $\psi_1 = \omega|_{I_p}$ (check!). We remind the reader that the notion
of a fundamental character is not functorial. That is, if $U$ is an open subgroup
of $D_p$ with fixed field $K$, then we can define fundamental characters for $K$
in a similar manner, but $\psi_n|_U$ is usually not a fundamental character for
$K$ (even if $K$ has residue field $\mathbf{F}_p$).

Upon choosing a basis for $\mathbf{F}_{p^2}$ over $\mathbf{F}_p$, we get an
injection of groups

$$j_2 : \mathbf{F}_{p^2}^\times \hookrightarrow \mathrm{GL}_2(\mathbf{F}_p).$$

Observe that as a 2-dimensional $\mathbf{F}_p$-representation for $I_p$,
$j_2 \circ \psi_2$ is semisimple and therefore irreducible (as the character
$\psi_2$ does not take all of its values in $\mathbf{F}_p^\times$). This
representation remains irreducible under any extension of scalars of odd degree
over $\mathbf{F}_p$. However, once we extend scalars so that $\mathbf{F}_{p^2}$
lies in the field of scalars, the representation decomposes into a direct
sum of the two distinct characters, $\psi_2$ and $\psi_2^p$ (viewed as taking
values in the field of scalars).
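
The irreducibility of the image of $j_2$ over $\mathbf{F}_p$ can be checked by hand in a small case. The sketch below is an illustration only: it assumes $p \equiv 3 \pmod 4$, so that $\mathbf{F}_{p^2} = \mathbf{F}_p[i]$ with $i^2 = -1$, picks the basis $\{1, i\}$ (one choice among many), and verifies that multiplication by an element outside $\mathbf{F}_p$ has no stable line in $\mathbf{F}_p^2$. The function names are hypothetical.

```python
def mult_matrix(a, b, p):
    """Matrix of multiplication by a + b*i on F_p[i] (i^2 = -1) in the basis {1, i}.
    Assumes p = 3 mod 4, so that F_p[i] is the field with p^2 elements."""
    return [[a % p, (-b) % p], [b % p, a % p]]

def has_stable_line(M, p):
    """True if some non-zero v in F_p^2 is mapped by M to a multiple of itself."""
    for v0 in range(p):
        for v1 in range(p):
            if (v0, v1) == (0, 0):
                continue
            w0 = (M[0][0] * v0 + M[0][1] * v1) % p
            w1 = (M[1][0] * v0 + M[1][1] * v1) % p
            # v and M*v are proportional exactly when this 2x2 determinant vanishes
            if (v0 * w1 - v1 * w0) % p == 0:
                return True
    return False

p = 3
# Elements a + b*i with b != 0 lie outside F_p: no line in F_p^2 is stable.
print(any(has_stable_line(mult_matrix(a, b, p), p)
          for a in range(p) for b in range(1, p)))
# Elements of F_p act as scalars, so every line is stable.
print(has_stable_line(mult_matrix(2, 0, p), p))
```

Since $\psi_2$ maps $I_p$ onto $\mathbf{F}_{p^2}^\times$, a line stable under $j_2 \circ \psi_2$ would have to be stable under every such matrix, which the first check rules out in this sample case.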


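Theorem 1.1. Let $E_{/\mathbf{Q}_p}$ be a semistable elliptic curve. If the
reduction type is either ordinary or multiplicative, then

$$\rho_{E,p} \simeq \begin{pmatrix} \varepsilon\chi & * \\ 0 & \chi^{-1} \end{pmatrix}$$

with $\chi : D_p \to \mathbf{Z}_p^\times$ a continuous unramified character. In
particular,

$$\bar\rho_{E,p} \simeq \begin{pmatrix} \omega\bar\chi & * \\ 0 & \bar\chi^{-1} \end{pmatrix},$$

with $\bar\chi : D_p \to \mathbf{F}_p^\times$ a continuous unramified character.

If the reduction type is good supersingular, then the 2-dimensional
$\mathbf{F}_p$-representation

$$\bar\rho_{E,p}|_{I_p} : I_p \to \mathrm{Aut}(E[p](\bar{\mathbf{Q}}_p)) \simeq \mathrm{GL}_2(\mathbf{F}_p)$$

is isomorphic to the fundamental character of level 2. Moreover, $\bar\rho_{E,p}$ is
absolutely irreducible and $\mathbf{Q}_p \otimes_{\mathbf{Z}_p} \rho_{E,p}$ is
irreducible.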


REMARKS. The irreducibility features of the supersingular case are quite a 
contrast with the relatively simple (and ‘reducible’) representation-theore- 
tic description of the other cases. The supersingular case therefore requires 
a completely different treatment in Wiles’ proof, and that is the purpose 
of the ‘flat’ deformation functor, as we shall see. The convenience of the 
general description of the representations in Theorem 1.1 (when combined 
with Theorem 1.2) is why semistable cases are much more accessible (at 
present) than cases with additive reduction. 
The results in Theorem 1.1 for $\bar\rho_{E,p}$ in cases of good reduction were
first proven by Serre [30, §1.11, Prop 11, 12] using formal groups. Since
group schemes will be fundamental in our work below, we give a different
proof that uses the theory of group schemes, particularly general theorems
of Raynaud [28, §3] which were motivated by (but do not logically depend
upon) the results of Serre.

We will use Raynaud’s results to prove (in Theorem 1.8) a generalization
of the good reduction cases under the hypothesis $p \neq 2$; however, the case
of elliptic curves merits special treatment since in the ‘reducible’ cases we
can interpret the unramified quotient $\chi^{-1}$ quite concretely, as is shown in
the proof below.

PROOF. First we consider good ordinary and bad multiplicative reduction.
In both cases, we wish to produce an unramified rank 1 quotient of $\rho_{E,p}$. If
we call the character on such a quotient $\chi^{-1}$ and recall that
$\det \rho_{E,p} = \varepsilon$, the rest is then clear. Suppose the reduction is
bad multiplicative. In this case, the theory of Tate models supplies us with a
$D_p$-module isomorphism
$E(\bar{\mathbf{Q}}_p) \simeq (\bar{\mathbf{Q}}_p^\times/q^\mathbf{Z})(\chi)$ for
some $q \in p\mathbf{Z}_p$ and some continuous unramified character
$\chi : D_p \to \langle -1 \rangle$ [32, Ch V, Thm 5.3]. Choosing consistent
$p^n$th roots of 1 in $\bar{\mathbf{Q}}_p$ then gives rise to a rank 1
subrepresentation inside of $\rho_{E,p}$, and the quotient by this is visibly free
of rank 1 and unramified.

Now suppose the reduction is good ordinary. Observe that the Néron
model $\mathcal{E}_{/\mathbf{Z}_p}$ of $E_{/\mathbf{Q}_p}$ is an elliptic curve over
$\mathbf{Z}_p$ whose $p$-divisible group has a closed fiber with non-trivial
connected and étale factors (this is essentially the definition of what it means
for the closed fiber $\mathcal{E}_{/\mathbf{F}_p}$ of the Néron model to be an
ordinary elliptic curve). Thus, the $p$-divisible group of
$\mathcal{E}_{/\mathbf{Z}_p}$ has non-trivial connected and étale factors (since
formation of the connected-étale sequence of a finite flat group scheme over a
henselian local base is compatible with base change by a local map of base rings,
such as $\mathbf{Z}_p \to \mathbf{F}_p$). Passing to the generic fiber over
$\mathbf{Q}_p$ and then to $\bar{\mathbf{Q}}_p$-points, the non-trivial
connected-étale sequence over $\mathbf{Z}_p$ gives rise to the desired
decomposition of $\rho_{E,p}$, since a finite étale cover of
$\mathrm{Spec}(\mathbf{Z}_p)$ has a generic fiber $\mathrm{Spec}(L)$, with $L$ a
finite product of finite unramified extensions of $\mathbf{Q}_p$ (and base change
preserves the exactness property of a short exact sequence of finite flat group
schemes and thus of $p$-divisible groups). In particular, we can interpret the
unramified quotient of $\rho_{E,p} = \mathrm{Ta}_p(E(\bar{\mathbf{Q}}_p))$ as
precisely $\mathrm{Ta}_p(\mathcal{E}(\bar{\mathbf{F}}_p))$, via the reduction map on
points.

Lastly, consider the case in which $E$ has good supersingular reduction
at $p$. Assume that we have established the desired description of
$\overline{\rho}_{E,p}|_{I_p}$
in terms of the fundamental character of level 2. Let us see how to prove
the other assertions. First of all, $\overline{\rho}_{E,p}$ must be absolutely irreducible. This
is because if it has a stable line over $\overline{\mathbf{F}}_p$, then the eigencharacter along
this line gives a character on $D_p$ whose restriction to $I_p$ is either $\psi_2$ or
$\psi_2^p$ (the two characters occurring in the decomposition of the semisimple
$\overline{\mathbf{F}}_p \otimes_{\mathbf{F}_p} \overline{\rho}_{E,p}|_{I_p}$). But conjugation by a
Frobenius element on $I_p$ interchanges
$\psi_2$ and $\psi_2^p$ (check!), so neither $\psi_2$ nor $\psi_2^p$ extends to an
$\overline{\mathbf{F}}_p^\times$-valued character
on $D_p$. This proves that $\overline{\rho}_{E,p}$ is (absolutely) irreducible. If
$\mathbf{Q}_p \otimes_{\mathbf{Z}_p} \rho_{E,p}$ is
reducible, then we can scale a generator of a $D_p$-stable line so that it is part
of a basis for the lattice $\rho_{E,p}$. Reducing this mod $p$ then would contradict
the irreducibility of $\overline{\rho}_{E,p}$.
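
The ‘(check!)’ above can be made concrete with a small computation. The following Python sketch is an illustration only. It assumes that a tame inertial character of level 2 is recorded by its exponent modulo $p^2 - 1$ relative to the level 2 fundamental character, and that conjugation by a Frobenius element multiplies this exponent by $p$ (the explicit conjugation action of Frobenius on tame inertia). The function names are hypothetical.

```python
def frobenius_conjugate(e, p, n=2):
    """Exponent of the Frobenius conjugate of psi_n^e.

    Assumed rule: conjugation by Frobenius sends psi_n^e to psi_n^(p*e),
    with exponents read modulo p^n - 1 (the order of psi_n).
    """
    return (p * e) % (p**n - 1)


def is_frobenius_invariant(e, p, n=2):
    """Necessary condition for psi_n^e to extend to a character of D_p:
    the character must be fixed by conjugation by Frobenius."""
    return frobenius_conjugate(e, p, n) == e % (p**n - 1)


for p in (2, 3, 5, 7):
    # psi_2 has exponent 1 and psi_2^p has exponent p; conjugation swaps them.
    assert frobenius_conjugate(1, p) == p % (p**2 - 1)
    assert frobenius_conjugate(p, p) == 1
    # Neither of the two characters is fixed, so neither extends to D_p.
    assert not is_frobenius_invariant(1, p)
    assert not is_frobenius_invariant(p, p)
```

By contrast, the exponent $p + 1$ is fixed by this rule, which is consistent with the level 1 fundamental character extending to $D_p$.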

It remains to prove that $\overline{\rho}_{E,p}|_{I_p}$ has the asserted form. We postpone
the proof of this until after Theorem 1.7, as it will require some results 
from the theory of finite flat group schemes. The careful reader can check 
that this does not lead to circular reasoning. 

Theorem 1.1 shows that from the point of view of Galois representations, 
it is natural to combine the cases of ordinary and multiplicative reduction 
into one case, called ‘ord’ by Wiles, and to treat the case of supersingular 
reduction separately; this latter case is called ‘fl’ by Wiles, as its study 
involves the theory of finite flat group schemes in order to circumvent the 
lack of upper triangular representations in the supersingular case. We want
to formulate ‘abstract’ properties satisfied by the deformation $\rho_{E,p}$ of
$\overline{\rho}_{E,p}$ (and we will require these for all deformations of
$\overline{\rho}_{E,p}$ that we study). The
‘ord’ condition has an obvious analogue for a representation into $GL_2(R)$,
where $R$ is any topological $\mathbf{Z}_p$-algebra. However, it is not clear how to
formulate a representation-theoretic condition which both captures the
supersingular Tate module representations and also makes sense in
deformation theory. For example, ‘irreducibility’ of a $GL_2(R)$-representation is a
bad notion when $R$ is not a field. The correct condition is supplied by the
theory of finite flat group schemes, and it is a very subtle condition from a 
representation-theoretic viewpoint. 

It is worth pointing out that the finite flat group scheme techniques 
we will use to study the supersingular case are also needed at a critical 
technical step in the ‘minimal’ deformation problem for some ‘ord’ cases 
(essentially because of [38, (3.1)]; also see Example 1.3(i) below). Thus,
what we are about to do is not only needed in the supersingular cases. 

The result which makes finite flat group schemes a relevant notion for 
us is the following fundamental fact, whose most ‘natural’ proof requires 
a serious use of algebraic geometry to handle the case of torsion levels 
divisible by p, which is the case of interest to us here. In fact, Theorem 
1.2 below was implicitly invoked in the proof of Theorem 1.1 when we used 
the theory of p-divisible groups to treat the ordinary reduction case. If 
in Theorem 1.2 one only cared about torsion levels which are prime to p, 
an argument using just the criterion of Néron-Ogg-Shafarevich (for elliptic 
curves) and Galois descent of rings [3, §6.2, Ex B] would suffice. 


Theorem 1.2. Let $E_{/\mathbf{Q}_p}$ have good reduction. Then for all $m \neq 0$, the
$\mathbf{Q}_p$-group scheme $E[m]$ which is canonically attached to the
$D_p$-representation $E[m](\overline{\mathbf{Q}}_p)$ is the generic fiber of a finite flat
group scheme over $\mathbf{Z}_p$
(in fact, the $m$-torsion on the Néron model over $\mathbf{Z}_p$ is such a finite flat
$\mathbf{Z}_p$-group scheme).

PROOF. We will first give a ‘natural’ proof that uses general principles in
algebraic geometry. At the end we will give an ad hoc proof which is not 
at all ‘natural,’ but uses only the theory of elliptic curves over fields and 
basic facts from commutative algebra and the theory of schemes. 

Let $\mathcal{E}_{/\mathbf{Z}_p}$ denote the Néron model of $E_{/\mathbf{Q}_p}$. This is an
abelian scheme
over $\mathbf{Z}_p$ with relative dimension 1. It suffices to prove that if
$A \to \mathrm{Spec}(R)$
is any abelian scheme of relative dimension $d$ over an affine noetherian
base (i.e., a proper smooth group scheme with $d$-dimensional connected
fibers, so necessarily geometrically connected [16, IV, 4.5.13]), then the
multiplication map $m_A : A \to A$ is finite and flat. Base extension by the
identity section of $A_{/R}$ then yields the desired conclusion that
$A[m] \to \mathrm{Spec}(R)$ is finite and flat. The general principle is to reduce to the case in
which $R$ is an algebraically closed field and then to appeal to the ‘classical’
theory of abelian varieties (or just elliptic curves). We give the argument
in the case of arbitrary relative dimension for conceptual clarity.

Here is the proof that $m_A$ is finite and flat. Certainly $m_A$ is proper.
It is also quasi-finite, as can be checked on geometric fibers, in which case
it is a standard result in the theory of abelian varieties over algebraically
closed fields [24, p. 62] (or for elliptic curves in our setting of interest
[31, Ch III, Prop 4.2(a)]). It follows that $m_A : A \to A$ is a quasi-finite
and proper map between noetherian schemes, so by [16, IV, 8.11.1] it is
necessarily finite (one can avoid noetherian and even finite presentation
hypotheses [16, IV, 18.12.4], but more work is then needed). The essential
content of this statement is that the map is affine. Even in the ‘concrete’ setting
of an elliptic curve $\mathcal{E}$ over $\mathbf{Z}_p$ given by an explicit integral Weierstrass
model in $\mathbf{P}^2_{\mathbf{Z}_p}$, it is not at all a priori obvious that the
‘multiplication by
$m$’ map on $\mathcal{E}$ is actually affine (but we will see below that one can prove
by elementary ad hoc means that $\mathcal{E}[m]$ is affine).

For flatness, we observe that since $A \to \mathrm{Spec}(R)$ is flat and of finite
presentation, by the fiber-by-fiber criterion for flatness [16, IV, 11.3.10(2)]
it suffices to check the flatness of $m_A : A \to A$ along fibers over $\mathrm{Spec}(R)$,
so we may suppose that $R$ is a field, even algebraically closed (by faithful
flatness of field extensions). By the classical theory of abelian varieties
[24, p. 62], $m_A$ is a map between two smooth, irreducible varieties of the
same dimension $d$ over an algebraically closed field, with fibers over all
closed points of dimension $d - d = 0$. It is then a consequence of the
fundamental ‘local criterion for flatness’ that such a map is necessarily flat
[21, Cor, Thm 23.1].

Now we sketch an ‘elementary’ proof that $E[m]$ is the generic fiber of
a finite flat $\mathbf{Z}_p$-group scheme (this proof will work for any good reduction
elliptic curve over the fraction field of a discrete valuation ring). Let
$\mathcal{E} \hookrightarrow \mathbf{P}^2_{\mathbf{Z}_p}$ denote an integral ‘good’
Weierstrass model for $E_{/\mathbf{Q}_p}$. One can check
without appealing to the general theory of Néron models that $\mathcal{E}$ has the
structure of a group scheme over $\mathbf{Z}_p$ which extends that on the generic fiber
$E_{/\mathbf{Q}_p}$ [32, Ch IV, Remark 5.4.1]. It remains to check that the $\mathbf{Z}_p$-group
scheme $\mathcal{E}[m]$ is finite and flat over $\mathbf{Z}_p$. By the classical theory of elliptic
curves over fields, we know that the generic and closed fibers are both finite
schemes (over $\mathbf{Q}_p$ and $\mathbf{F}_p$ respectively) with the same rank, namely $m^2$.
Note that the closed fiber might not be reduced!


Assume for the moment that we know $\mathcal{E}[m]$ is an affine scheme, say
of the form $\mathrm{Spec}(R)$ for a $\mathbf{Z}_p$-algebra $R$. Since
$\mathrm{Spec}(R) \to \mathrm{Spec}(\mathbf{Z}_p)$ is
universally closed, by [1, Exer. 35] (this is the essential commutative
algebra input) we see that $R$ is integral over $\mathbf{Z}_p$. As it is of finite type by
construction, $R$ is finite over $\mathbf{Z}_p$. The equality of the generic and closed
fiber dimensions then yields freeness over $\mathbf{Z}_p$ (by the structure theorem for
finite $\mathbf{Z}_p$-modules), and hence flatness.
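
The last step can be spelled out numerically. The sketch below is only an illustration of the structure theorem argument: it assumes a finite $\mathbf{Z}_p$-module written as $\mathbf{Z}_p^r \oplus \bigoplus_i \mathbf{Z}_p/p^{a_i}$ with every $a_i \geq 1$, and compares the dimension of the generic fiber (after tensoring with $\mathbf{Q}_p$) with that of the closed fiber (after reducing mod $p$). The function names and the encoding of the module by `free_rank` and `torsion_exponents` are hypothetical.

```python
def fiber_dimensions(free_rank, torsion_exponents):
    """Return (generic, closed) fiber dimensions of
    Z_p^free_rank (+) sum_i Z_p / p^(a_i), with a_i in torsion_exponents."""
    if free_rank < 0 or any(a < 1 for a in torsion_exponents):
        raise ValueError("need free_rank >= 0 and every exponent >= 1")
    # Tensoring with Q_p kills every torsion summand.
    generic_dim = free_rank
    # Reducing mod p turns each summand, free or torsion, into one copy of F_p.
    closed_dim = free_rank + len(torsion_exponents)
    return generic_dim, closed_dim


def is_free(free_rank, torsion_exponents):
    """The module is free exactly when the two fiber dimensions agree."""
    generic_dim, closed_dim = fiber_dimensions(free_rank, torsion_exponents)
    return generic_dim == closed_dim
```

For instance, `is_free(4, [])` holds, while a module such as $\mathbf{Z}_p^3 \oplus \mathbf{Z}_p/p^2$ has fiber dimensions 3 and 4 and so cannot occur when both fibers have the same rank.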


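It remains to show that $\mathcal{E}[m]$ is an affine scheme. Since it is at least a
closed subscheme of $\mathbf{P}^2_{\mathbf{Z}_p}$, if we can find an open affine in
$\mathbf{P}^2_{\mathbf{Z}_p}$ which contains
$\mathcal{E}[m]$, then we will have exhibited $\mathcal{E}[m]$ as a closed subscheme of an affine
scheme, and so $\mathcal{E}[m]$ must be affine. The closed fiber of $\mathcal{E}[m]$ is finite over
$\mathbf{F}_p$, so we can find a homogeneous polynomial
$\overline{f} \in \mathbf{F}_p[X, Y, Z]$ such that
$V(\overline{f}) \subseteq \mathbf{P}^2_{\mathbf{F}_p}$ does not meet the closed fiber of
$\mathcal{E}[m]$. Choose a lifting of $\overline{f}$
to a homogeneous polynomial $f \in \mathbf{Z}_p[X, Y, Z]$. Since $V(f)$ and $\mathcal{E}[m]$ are
closed subschemes of $\mathbf{P}^2_{\mathbf{Z}_p}$, if their intersection is non-empty, it contains a
closed point $x$. Such a point maps to the closed point of $\mathrm{Spec}(\mathbf{Z}_p)$ by the
properness of $\mathbf{P}^2_{\mathbf{Z}_p} \to \mathrm{Spec}(\mathbf{Z}_p)$. Our assumption on
$V(\overline{f})$ therefore rules
out the existence of such a point $x$, so $\mathcal{E}[m]$ is a closed subscheme of the
open affine scheme
$\mathrm{Spec}(\mathbf{Z}_p[X, Y, Z]_{(f)}) = \mathbf{P}^2_{\mathbf{Z}_p} \setminus V(f)$
inside of $\mathbf{P}^2_{\mathbf{Z}_p}$.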

We note in passing that it is not enough in the proof of Theorem 1.2 to 
have a proper, smooth, integral model for $E_{/\mathbf{Q}_p}$ (e.g., a ‘good’ projective
integral Weierstrass model); it is critical to know that such a model can be 
chosen that is a group scheme (compatible with the group scheme structure 
on the generic fiber). 

The converse to Theorem 1.2 is also true in a strong sense, though
we will not need it in any of our proofs. More precisely, if $E[p^r]$ is the
generic fiber of a finite flat group scheme over $\mathbf{Z}_p$ for all $r \geq 1$, then
$E_{/\mathbf{Q}_p}$ has good reduction. This is a special case of a more general theorem
of Grothendieck’s on $p$-divisible groups [18, IX, Cor 5.10], which says in
particular that an abelian variety $A$ over $\mathbf{Q}_p$ has good reduction if and
only if its $p$-divisible group $\Gamma$ has ‘good reduction’ in the sense that it
is the generic fiber of a $p$-divisible group over $\mathbf{Z}_p$. This is the correct
analogue of the Néron-Ogg-Shafarevich criterion ‘at $p$.’ In order to apply
this, we need to invoke Raynaud’s theorem [28, 2.3.1] which asserts that
Grothendieck’s criterion is equivalent to the a priori weaker condition that
each torsion level of $\Gamma$ is the generic fiber of a finite flat group scheme over
$\mathbf{Z}_p$ (i.e., we do not need to assume that there is any compatibility between
the $\mathbf{Z}_p$-group schemes for each $p$-power torsion level). Keep in mind that
it is essential for the converse of Theorem 1.2 that we have a condition on
large torsion levels. This is illustrated by the first part of the following
example.


Example 1.3.

(i) It is possible that $E_{/\mathbf{Q}_p}$ has bad reduction (and so $E[p^n]$ is not
the generic fiber of a finite flat $\mathbf{Z}_p$-group scheme for some large $n$) while
$E[p]$ has good reduction (i.e., it is the generic fiber of a finite flat $\mathbf{Z}_p$-group
scheme). For example, if $E_{/\mathbf{Q}_p}$ has bad split multiplicative reduction with
non-integral $j(E)$ a $p$th power in $\mathbf{Q}_p$, then the theory of Tate models shows

$$\overline{\rho}_{E,p} \simeq \omega \oplus 1,$$

which is certainly the $D_p$-representation attached to the generic fiber of a
finite flat group scheme over $\mathbf{Z}_p$, namely $\mu_p \times_{\mathbf{Z}_p} \mathbf{Z}/p$.


For $E_{/\mathbf{Q}}$ such that $E_{/\mathbf{Q}_p}$ has bad reduction and
$E[p]_{/\mathbf{Q}_p}$ has good reduction,
Wiles’ proof studies $\rho_{E,p}$ as an ‘ord’ deformation of $\overline{\rho}_{E,p}$. But in
the critical associated ‘minimal’ deformation problem, the ‘flat’ methods
of this article are needed; see [38, (3.1)].


(ii) If $E_{/\mathbf{Q}_p}$ has good ordinary reduction of the form

$$\overline{\rho}_{E,p} \simeq \begin{pmatrix} \omega & * \\ 0 & 1 \end{pmatrix},$$

then the $p$-torsion on the Néron model $\mathcal{E}_{/\mathbf{Z}_p}$ of $E_{/\mathbf{Q}_p}$ is a
finite flat group
scheme over $\mathbf{Z}_p$ with order $p^2$, though it is not easy to describe this
$\mathbf{Z}_p$-scheme explicitly. It can be shown that $\mathcal{E}[p]$ fits into a short exact sequence
of finite flat commutative $\mathbf{Z}_p$-group schemes. Such a sequence must have
the form

$$0 \to \mu_p \to \mathcal{E}[p] \to \mathbf{Z}/p \to 0,$$

except when $\mathcal{E}[p] \simeq \mu_p \times_{\mathbf{Z}_p} \mathbf{Z}/p$, in which case there is also a sequence
corresponding to the other way of splitting $\mathcal{E}[p]$. Theorems 1.6 and 1.7
supply the results needed to justify this claim. We leave this as an exercise
(it will not be needed).


Recall that for a field $F$ of characteristic 0, all finite $F$-group schemes
are étale over $F$ [34], and so with a choice of an algebraic closure $\overline{F}$, we
get an order-preserving equivalence of categories between finite flat group
schemes over $F$ and finite discrete modules over
$G_F = \mathrm{Gal}(\overline{F}/F)$. This
allows us to single out certain Galois modules as special.


Definition 1.5. Let $K$ be a field of characteristic 0 that is the fraction
field of a henselian (e.g., complete) discrete valuation ring $R$ with residual
characteristic $p$. Fix an algebraic closure $\overline{K}$. We say that a continuous
representation

$$\rho : G_K \to \mathrm{Aut}(M)$$

on a finite abelian $p$-group $M$ is $R$-flat (or just flat when $R$ is understood
from the context) if there exists a finite flat group scheme $H_{/R}$ such that
the $G_K$-representation $H(\overline{K})$ is isomorphic to $\rho$ (or, equivalently, such that
the finite $K$-group scheme canonically attached to $\rho$ is the generic fiber of
a finite flat group scheme over $R$).

The ‘flatness’ refers to the essential property of $H \to \mathrm{Spec}(R)$. The
representation $\rho$ is not being required to be flat over anything. Also, it
should be emphasized that flatness as defined above is really a property
of finite Galois modules (though in Definition 2.1 we will extend it in
a formal way to more general Galois modules, such as Tate modules of
elliptic curves). Note that in the setting of the above definition, the $H$
which arises necessarily has order equal to the size of the abelian group $M$
(since by $R$-flatness, this can be checked after passage to a geometric fiber
over the generic point).

If there were many different choices for H, the notion of ‘flatness’ as 
defined above would not be a natural one to use. In addition, if the
representation $\rho$ had extra structure such as that of a vector space over a
large finite field $k$ (with the $k$-action commuting with the $G_K$-action of
$\rho$), then we would like such extra endomorphism structure to extend to
H, or else the concept of a ‘flat’ representation would likely be too weak 
to use. Fortunately, we have the following two fundamental results due to 
Raynaud. 


Theorem 1.6. (Raynaud) Let $R$, $K$, $\overline{K}$, and $G_K$ be as above. Consider
the covariant functor

$$F_R : H \rightsquigarrow H(\overline{K})$$

from finite flat $p$-power order $R$-group schemes to flat $G_K$-modules. When
$e(R) < p - 1$ (so $p \neq 2$), $F_R$ is an equivalence of categories. Moreover, the
category of finite flat $p$-power order $R$-group schemes is an abelian category
via the usual scheme-theoretic kernel and quotient constructions.

Suppose $e(R) = p - 1$ and $V$ is a continuous $k$-linear representation of
$G_K$ on a one-dimensional vector space over a finite field $k$ of characteristic
$p$. Assume that the $\mathbf{F}_p[G_K]$-module underlying $V$ is simple (i.e., the natural
map $\mathbf{F}_p[G_K] \to k$ is surjective). Then either there is a unique (up to
canonical isomorphism) finite flat $R$-group scheme $H$ with $H(\overline{K}) \simeq V$, or else
there are two such $R$-group schemes, one étale and the other multiplicative
(i.e., with an étale Cartier dual).

The essential image of the functor $F_R$ is stable under passage to
subrepresentations, to quotients, and to (finite) direct sums without restriction
on $e(R)$.

PROOF. The proof of full faithfulness consists of reducing to the case in
which $H(\overline{K})$ is a simple representation, and then one writes down the most
general form of $H$ that could possibly give rise to the given representation
(see Theorem 1.7 and the discussion preceding it). We discuss the
quasi-inverse functor below. The proof of the stability of the essential image
under various constructions is much simpler and is proved by the method
of ‘scheme-theoretic closure.’ See [28, 2.1, 3.3.2(3), 3.3.6] or [34, section 4]
for further details.

Though [28, 2.1] refers to the fppf topology for formation of quotients, 
in our setting it is possible to get away with a more naive construction of 
quotients that exploits Cartier duality for finite flat group schemes. One es- 
sentially defines the quotient by a closed subgroup scheme to be the Cartier 
dual of a suitable kernel. We omit the details of this alternative construc- 
tion (though there is some work needed to verify flatness), as this would 
be too much of a digression here, except we note that such an alternative 
construction supplies an elementary proof that base change preserves short 
exactness and that applying Cartier duality to a short exact sequence of 
finite flat group schemes preserves the property of being short exact. See
[5, §2.2] for further details. □

One can describe the ‘quasi-inverse’ to the functor in Theorem 1.6 when
$e(R) < p - 1$. In down-to-earth terms, consider a flat representation $\rho$ with a
representation space $V_\rho$ whose underlying group is a finite abelian $p$-group.
With our choice of $\overline{K}$, there is (via Galois descent) canonically attached
to $V_\rho$ a finite $K$-group scheme $\mathrm{Spec}(K_\rho)$ whose representation on
$\overline{K}$-points
is $V_\rho$. The essential content of Theorem 1.6 is that inside of $K_\rho$ there is
a unique finite $R$-subalgebra $\mathcal{O}_\rho$ which has generic fiber $K_\rho$ and which is
‘stable’ under the group law (or, rather, $K$-Hopf algebra) morphisms on $K_\rho$.
This $\mathcal{O}_\rho$ is the affine $R$-algebra for the unique finite flat $R$-group scheme
whose generic fiber representation is $V_\rho$. The functor
$\rho \rightsquigarrow \mathrm{Spec}(\mathcal{O}_\rho)$ is the
sought-after ‘quasi-inverse’ functor.

The idea behind Raynaud’s proof is to build everything up from an
analysis of finite flat $R$-group schemes with generic fiber representations that
are simple of $p$-power order. This generalizes earlier results of Oort-Tate
[26, Thm 2] in the case of finite flat group schemes of order $p$. Consider a
(possibly non-simple) finite discrete $G_K$-representation $V$ with $p$-power
order. How many (if any) finite flat $R$-group schemes $H$ admit $V$ as a generic
fiber representation (i.e., $V \simeq H(\overline{K})$ as $G_K$-modules)? By the theory of
Galois descent of schemes [3, §6.2, Ex B], or else by an ad hoc modification
of the proof of classical Galois descent for fields [31, Ch II, Lemma 5.8.1],
knowledge of such an $H$ is equivalent to knowledge of $V$ and of $H \times_R R^{sh}$,
with $R^{sh}$ the strict henselization of $R$ (e.g., when $R = \mathbf{Z}_p$,
$R^{sh} = \mathbf{Z}_p^{un}$;
see [3, §2.3, Prop 10, 11]). Thus, the essential case is when $R$ is strictly
henselian, so we can replace $R$ by $R^{sh}$ (and $G_K$ by the inertia subgroup
$I_K = G_{K^{sh}}$).

For $R$ strictly henselian and $V$ simple, $k = \mathrm{End}_{G_K}(V)$ is a finite division
ring and so by Wedderburn’s Theorem is a finite field with characteristic
$p$ (here, we temporarily abandon our usual convention that such a field
is equipped with an embedding into $\overline{\mathbf{F}}_p$). Hence, $V$ is canonically an
irreducible $k[G_K]$-module with finite $k$-dimension. Because $R$ is a strictly
henselian discrete valuation ring with residue characteristic $p$, $G_K$ has a
pro-$p$ normal subgroup (wild inertia) with a pro-prime-to-$p$ abelian quotient
(tame inertia). To justify this structure for $G_K$, note that the integral
closure of $R$ in a finite (necessarily separable) extension of $K$ is finite as an
$R$-module. Since $R$ is a strictly henselian discrete valuation ring, such an
integral closure is also a (strictly henselian) discrete valuation ring; thus,
we can easily pass to the completion of $R$ in place of $R$ and then use [30,
§1] to analyze $G_K$.

We conclude from the structure of $G_K$ that $V$ is a tame (and therefore
abelian) representation [29, Ch IX, Lemma 2], so from the structure theory
of semisimple rings we see that $\dim_k(V) = 1$. That is, $V$ is given by a
continuous character

$$\psi : G_K \to k^\times.$$

Raynaud succeeded in describing precisely the $\psi$ which can arise in this
way. The description is in terms of fundamental characters. Since we have
only discussed fundamental characters for $\mathbf{Q}_p$, we will state the result in
that limited context with $R = \mathbf{Z}_p^{un}$ (which suffices for our purposes).


Theorem 1.7. (Raynaud) Consider a continuous character

$$\psi : G_{\mathbf{Q}_p^{un}} = I_p \to k^\times,$$

with $k$ a finite field of characteristic $p$. Choose an embedding
$k \hookrightarrow \overline{\mathbf{F}}_p$. Let
$V$ be a one-dimensional $k$-vector space with $I_p$-action given by $\psi$. Then $V$
is $\mathbf{Z}_p^{un}$-flat if and only if

$$\psi = \psi_n^e,$$

where $|k| = p^n$, $\psi_n$ is the fundamental character of level $n$, and
$e = e_0 + e_1 p + \cdots + e_{n-1} p^{n-1}$, with
$0 \leq e_i \leq e(\mathbf{Z}_p^{un}) = 1$.

PROOF. See [28, 3.4.3] or [34, section 4] for details. The proof of ‘only
if’ proceeds as follows. Let $H_{/\mathbf{Q}_p^{un}}$ be the finite $p$-power order
commutative $\mathbf{Q}_p^{un}$-group scheme associated to $V$, so the $k$-action on the
(one-dimensional) flat $I_p$-representation space $V$ can be viewed as a $k$-action
on $H$. The key step is to prove that this action necessarily extends to a
$k$-action on certain finite flat $\mathbf{Z}_p^{un}$-group schemes which have generic fiber
$H$ [28, 3.3.1]. We remark in passing that one can exploit this to show that
there is a unique finite flat $\mathbf{Z}_p^{un}$-group scheme with generic fiber
representation $V$ when $p > 2$ and there are at most two such $\mathbf{Z}_p^{un}$-group schemes
when $p = 2$ and $V$ is a simple $\mathbf{F}_p[I_p]$-module; this is used in the proof of
Theorem 1.6.

Note that the choice of embedding $k \hookrightarrow \overline{\mathbf{F}}_p$ is harmless, since changing
this merely has the effect of cyclically permuting the ‘digits’ $e_i$. □

The example of $\mu_2$ and $\mathbf{Z}/2$ over $\mathbf{Z}_2$ shows that the full faithfulness in
Theorem 1.6 is false when $p = 2$, since the generic fibers of $\mu_2$ and $\mathbf{Z}/2$ over
$\mathbf{Q}_2$ are isomorphic, but the closed fibers over $\mathbf{F}_2$ are not (one is reduced,
the other is not), so $\mu_2$ and $\mathbf{Z}/2$ are not $\mathbf{Z}_2$-isomorphic (as schemes, let
alone as group schemes). Note that this also explains why Theorem 1.6
treats the case $e(R) = p - 1$ separately.

Theorem 1.7 and the discussion preceding it show that for
$\chi : D_p \to \overline{\mathbf{F}}_p^\times$
an unramified continuous character,
$\omega^i \chi : D_p \to \overline{\mathbf{F}}_p^\times$ is flat if and only if
$i \equiv 0, 1 \bmod p - 1$. The flatness condition is therefore a very severe constraint
on a representation. Though Theorem 1.2 shows that flat representations
arise naturally from elliptic curves, the general problem of describing all
flat representations (not just the ones as in Theorem 1.7) is very subtle.
An indirect answer to this question was given by Fontaine in a special case,
and we will discuss his theory in §4.
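
The digit condition of Theorem 1.7, and the level 1 consequence just stated, can be tabulated directly. The sketch below is a minimal illustration in the stated setting $e(\mathbf{Z}_p^{un}) = 1$: it lists the residues $e$ modulo $p^n - 1$ whose base-$p$ digits can all be taken in $\{0, 1\}$. The function names are hypothetical, and the rule that changing the embedding of $k$ multiplies $e$ by $p$ is an assumption used only to show the cyclic permutation of digits.

```python
from itertools import product


def flat_exponents(p, n):
    """Residues e mod p^n - 1 of the form e_0 + e_1*p + ... + e_(n-1)*p^(n-1)
    with every digit e_i equal to 0 or 1."""
    modulus = p**n - 1
    residues = {
        sum(digit * p**i for i, digit in enumerate(digits)) % modulus
        for digits in product((0, 1), repeat=n)
    }
    return sorted(residues)


def is_flat_exponent(e, p, n):
    """True when psi_n^e meets the digit condition (e is read mod p^n - 1)."""
    return e % (p**n - 1) in flat_exponents(p, n)


# Level 1: omega^i restricted to inertia qualifies only for i = 0, 1 mod p - 1.
assert flat_exponents(7, 1) == [0, 1]

# Level 2 with p = 3: exponents are read modulo 8.
assert flat_exponents(3, 2) == [0, 1, 3, 4]

# Multiplying e by p cyclically shifts the digits, so the set is unchanged.
assert sorted((3 * e) % 8 for e in flat_exponents(3, 2)) == [0, 1, 3, 4]
```

For $p = 2$ every residue passes the test, which reflects the fact that the digit bound $e_i \leq 1$ is no restriction at all in that case.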

Gathering together our results so far, we can complete the proof of 
Theorem 1.1 and in the supersingular reduction case can give a description 
of properties of $\rho_{E,p}$ which make sense for representations into $GL_2(R)$,
where R is any complete local noetherian ring with a finite residue field 
of characteristic p. This will be the starting point for the deformation- 
theoretic study of the supersingular case. First, we complete the proof of 
Theorem 1.1. 


PROOF. (of Theorem 1.1, continued) By Theorem 1.7 and the discussion
preceding it, it suffices to check that in the supersingular case,
$\overline{\rho}_{E,p}|_{I_p}$ is
irreducible (and then we can apply Theorem 1.7 with $k = \mathbf{F}_{p^2}$). If
$\overline{\rho}_{E,p}|_{I_p}$
is reducible, then by Theorem 1.6, the diagonal characters are $\mathbf{Z}_p^{un}$-flat.
Since $\det \overline{\rho}_{E,p} = \omega$, we see from Theorem 1.7 and the fact that the level 1
fundamental character is $\psi_1 = \omega|_{I_p}$ that there exists a short exact sequence
of the form

$$0 \to \chi_1 \to \overline{\rho}_{E,p}|_{I_p} \to \chi_2 \to 0,$$

where $\{\chi_1, \chi_2\} = \{\omega|_{I_p}, 1\}$. By Theorem 1.6, it follows that there exists a
short exact sequence of finite flat $\mathbf{Z}_p^{un}$-group schemes

$$0 \to G_1 \to \mathcal{E}^{un}[p] \to G_2 \to 0$$

which on $\overline{\mathbf{Q}}_p$-points realizes the given decomposition of
$\overline{\rho}_{E,p}|_{I_p}$. Here, $\mathcal{E}^{un}$
is the Néron model of the good reduction curve $E_{/\mathbf{Q}_p^{un}}$. It happens to be
the case that $\mathcal{E}^{un}$ is the base extension of the Néron model of
$E_{/\mathbf{Q}_p}$ [8,
§7.2, Thm 1(2)], but we will not need this. What we will need is that $\mathcal{E}^{un}$
has a supersingular closed fiber. This is clear naively, since
$j(E_{/\mathbf{Q}_p^{un}}) = j(E_{/\mathbf{Q}_p})$ is integral, and the image of this in
$\overline{\mathbf{F}}_p$ is the $j$-invariant of the
closed fiber. Since the reduction type is determined by the $j$-invariant, we
see that $\mathcal{E}^{un}$ has supersingular reduction.

Now assume $p \neq 2$ (so $\omega|_{I_p}$ is non-trivial). Let $G$ denote whichever of
$G_1$ or $G_2$ has a trivial generic fiber representation. We claim that $G$ is
étale. If this were true, then passing to the closed fiber would prove that
the étale factor of the finite flat group scheme
$\mathcal{E}^{un}[p] \times_{\mathbf{Z}_p^{un}} \overline{\mathbf{F}}_p$ is non-trivial,
and so there exist non-trivial $p$-torsion geometric points on the closed fiber,
contradicting the supersingularity condition.
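
For a curve over the prime field, the supersingularity condition invoked here can be tested by counting points. The sketch below relies on a standard criterion that is not stated in the text: an elliptic curve over $\mathbf{F}_p$ is supersingular exactly when its trace of Frobenius $a_p = p + 1 - \#E(\mathbf{F}_p)$ is divisible by $p$. The short Weierstrass form $y^2 = x^3 + ax + b$, the restriction to primes $p \geq 5$, and the function names are all assumptions made for illustration, and $p$ is not checked for primality.

```python
def count_points(a, b, p):
    """Number of points of y^2 = x^3 + a*x + b over F_p, including infinity.

    Assumes p is a prime >= 5 and that the curve is nonsingular mod p.
    """
    if p < 5 or (4 * a**3 + 27 * b**2) % p == 0:
        raise ValueError("need p >= 5 and a nonsingular curve mod p")
    # Count, for each residue r, how many y in F_p satisfy y^2 = r.
    square_roots = {}
    for y in range(p):
        r = (y * y) % p
        square_roots[r] = square_roots.get(r, 0) + 1
    affine = sum(
        square_roots.get((x**3 + a * x + b) % p, 0) for x in range(p)
    )
    return affine + 1  # add the point at infinity


def trace_of_frobenius(a, b, p):
    """a_p = p + 1 - #E(F_p)."""
    return p + 1 - count_points(a, b, p)


def is_supersingular(a, b, p):
    """Assumed criterion: supersingular if and only if p divides a_p."""
    return trace_of_frobenius(a, b, p) % p == 0
```

With these conventions, `is_supersingular(0, 1, 5)` holds for the hypothetical example $y^2 = x^3 + 1$ over $\mathbf{F}_5$, while $y^2 = x^3 + x + 1$ over $\mathbf{F}_5$ has $a_5 = -3$ and is ordinary.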

Hence, for the case of $p \neq 2$ we are reduced to checking the claim that
if $H_{/\mathbf{Z}_p^{un}}$ is a finite flat group scheme with $p$-power order and
$H(\overline{\mathbf{Q}}_p)$ has
trivial inertial action, then $H \to \mathrm{Spec}(\mathbf{Z}_p^{un})$ is étale. The generic fiber
representation of $G_{\mathbf{Q}_p^{un}} = I_p$ is trivial, so the full faithfulness in Theorem
1.6 implies that $H_{/\mathbf{Z}_p^{un}}$ is a constant group scheme and so it is clearly
étale over $\mathbf{Z}_p^{un}$. This completes the proof of Theorem 1.1 for odd $p$.

Now consider the case p = 2. This will require (at the end) more 
advanced results from algebraic geometry; since the case p = 2 is only 
being included for completeness and is not used in Wiles’ proof, the reader 
can skip this case. 

Note that each $G_i$ has trivial inertial generic fiber representation. It
is clear that $\mu_2$ and $\mathbf{Z}/2$ are two non-isomorphic $\mathbf{Z}_2^{un}$-group schemes with
trivial generic fiber representations, so by Theorem 1.6 we see that each $G_i$
is $\mathbf{Z}_2^{un}$-isomorphic to $\mu_2$ or $\mathbf{Z}/2$. Since $\mathcal{E}^{un}[p]$ is connected,
$G_1$ and $G_2$ are
connected. Thus, each $G_i$ is isomorphic to $\mu_2$.

Applying Cartier duality to the short exact sequence

$$0 \to \mu_2 \to \mathcal{E}^{un}[p] \to \mu_2 \to 0,$$

we get another short exact sequence. Granting that $\mathcal{E}^{un}[p]$ is self-dual,
the middle term in the Cartier-dualized short exact sequence is $\mathcal{E}^{un}[p]$.
However, the outer terms now are isomorphic to $\mathbf{Z}/2$ over $\mathbf{Z}_2^{un}$, so we have
a contradiction. □

In order to check the self-duality of $\mathcal{E}^{un}[p]$, it is not enough to know that
the generic fiber is self-dual (via the Weil pairing), since we cannot apply
Theorem 1.6 (with $p = 2$) to the non-simple generic fiber representation.
The essential problem is that we do not know naively that the Cartier dual
of $\mathcal{E}^{un}[p]$ is still connected (otherwise results of Raynaud and Fontaine
extending Theorem 1.6 for $p = 2$ could be used). This connectedness is
only known after we prove the stronger assertion of self-duality. Self-duality
follows from the Cartier-Nishi duality theorem [25, Cor 1.3(i)], applied to
the morphism $p : \mathcal{E}^{un} \to \mathcal{E}^{un}$, and the canonical autoduality of elliptic
curves [20, 2.1.2].

When $E_{/\mathbf{Q}_p}$ is an elliptic curve with good supersingular reduction, we
can give a list of special properties of the representation
$\overline{\rho} = \overline{\rho}_{E,p}$ and its
deformation class to $GL_2(\mathbf{Z}_p)$ represented by $\rho = \rho_{E,p}$. Namely,
$\det \overline{\rho} = \omega$,
$\overline{\rho}$ is flat (Theorem 1.2) and absolutely irreducible (Theorem 1.1), while
$\det \rho = \varepsilon$ and for every $n \geq 1$, the finite $p$-power order discrete $D_p$-module
$\rho \bmod p^n$ is flat (Theorem 1.2). The absolute irreducibility of $\overline{\rho}$ indicates
that we are not in the ‘ord’ case. Keeping in mind the converse to Theorem
1.2, these conditions will motivate our definition (in the next section) of the
‘flat deformation functor’ as a deformation problem for $D_p$-representations.
The remarkable fact that we will be able to explicitly describe the
deformation ring associated to this deformation problem (Theorem 3.5) will allow
us to reverse the usual process ‘$H^1$ tells us about the deformation ring’
to get information about a local $H^1$ term from knowledge about a
deformation ring. This ring-theoretic knowledge will be obtained by methods
based on the work of Fontaine and Ramakrishna. As we said earlier, this
local $H^1$ data is then pieced together with other Galois cohomological data
(at other places in $\Sigma$) to tell us about the size of a global $H^1$ term, and
thereby gives us a grip on a global deformation ring (e.g., allows us to show
it is isomorphic to a Hecke ring after a lot more work).

Using Raynaud’s results (Theorems 1.6, 1.7), we can somewhat
generalize the ‘good reduction’ parts of Theorem 1.1 when $p \neq 2$. This is
useful insofar as it will allow us to state the Main Theorem 3.3 with fewer 
hypotheses (and it is an aesthetically pleasing result too). 


Theorem 1.8. Let $R$ be a complete local noetherian ring with finite residue
field $k$ of characteristic $p \neq 2$. Choose a flat representation

$$\overline{\rho} : D_p \to GL_2(k)$$

with $\det \overline{\rho}|_{I_p} = \omega|_{I_p}$ and a continuous lift
$\rho : D_p \to GL_2(R)$ which gives
rise to an element of $D^{\mathrm{fl}}_{\overline{\rho}}(R)$ (see Definition 2.1, with
$\mathcal{O} = W(k)$ there).
Let 


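$$\rho_2 : I_p \xrightarrow{\psi_2} \mathbf{F}_{p^2}^\times \hookrightarrow GL_2(\mathbf{F}_p) \hookrightarrow GL_2(k)$$

be the map arising from a choice of an $\mathbf{F}_p$-basis of $\mathbf{F}_{p^2}$.

(i) If $\overline{\rho}$ is reducible, then there exist continuous unramified characters
$\overline{\chi}_i : D_p \to k^\times$ such that

$$\overline{\rho} \simeq \begin{pmatrix} \omega\overline{\chi}_1 & * \\ 0 & \overline{\chi}_2 \end{pmatrix}.$$

Otherwise $\overline{\rho}$ is absolutely irreducible and
$\overline{\rho}|_{I_p} \simeq \rho_2$ (and so
$\overline{\rho}|_{I_p} \otimes_k \overline{\mathbf{F}}_p \simeq \psi_2 \oplus \psi_2^p$).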


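(ii) If $\overline{\rho}$ is reducible, then there exist unique continuous unramified
characters $\chi_i : D_p \to R^\times$ such that

$$\rho \simeq \begin{pmatrix} \varepsilon\chi_1 & * \\ 0 & \chi_2 \end{pmatrix}.$$

In particular, $\det \rho|_{I_p} = \varepsilon|_{I_p}$ and
$\chi_i \bmod \mathfrak{m}_R = \overline{\chi}_i$.

(iii) If $\overline{\rho}$ is irreducible, then $\det \rho|_{I_p} = \varepsilon|_{I_p}$.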


REMARKS. It is essential in Theorem 1.8(i) that we assume a condition on
$\det \overline{\rho}|_{I_p}$. Otherwise one could construct counterexamples in the reducible
case using Raynaud’s Theorem 1.7 and in the irreducible case using an
unramified character $D_p \to k_2^\times$, with $k_2$ the quadratic extension of $k$.

In Theorem 1.8(ii),(iii) the main examples of such $R$ to keep in mind
are $\mathcal{O}/\lambda^n$ and
$(\mathcal{O}/\lambda^n)[\epsilon] = (\mathcal{O}/\lambda^n)[T]/(T^2)$, with $\mathcal{O}$ and
$\lambda$ as defined in
the beginning of §2.

The proof we give for (iii) involves Raynaud’s work on determinants of
$p$-divisible groups and so ultimately relies on the Zariski-Nagata theorem
on purity of the branch locus [17, p. 118]. With extra work, the proof of
(iii) can be extended to the case $p = 2$ also. The reader should appreciate
that there is a substantial amount of algebraic geometry lying behind the
assertion in (iii). An alternative proof (valid at least for $p \geq 5$ and perhaps
also for $p = 3$) can probably be obtained by a brute-force calculation using
Fontaine’s ideas from §4, together with [14, §6, §9]. But this alternative
procedure is not very insightful.
PROOF. (i) Suppose $\overline{\rho}$ is reducible, say with diagonal characters
$\eta_1, \eta_2 : D_p \to k^\times$. By local class field theory for $\mathbf{Q}_p$,
$\eta_j|_{I_p} = \omega^{n_j}|_{I_p}$ as $k^\times$-valued
characters, for suitable $n_j \in \mathbf{Z}/(p - 1)$. Using the observations after
Theorem 1.7, together with the fact that

$$\eta_1 \eta_2|_{I_p} = \det \overline{\rho}|_{I_p} = \omega|_{I_p},$$

we see that $\{n_1, n_2\} = \{0, 1\}$. It remains to show that if

$$\overline{\rho} \simeq \begin{pmatrix} \eta_1 & * \\ 0 & \eta_2 \end{pmatrix}$$

with $\eta_1$ unramified and $\eta_2|_{I_p} = \omega|_{I_p}$, then $\overline{\rho}$ splits.
Equivalently, we must
show that $\overline{\rho}$ has a non-zero unramified quotient.
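
The deduction that the inertial exponents of the two diagonal characters are 0 and 1 in some order is a finite check. The following sketch is an illustration under the conventions of the proof: exponents are read modulo $p - 1$, each one must be 0 or 1 by the observations after Theorem 1.7, and their sum must be 1 because of the determinant condition. The function name is hypothetical and the check is restricted to odd $p$.

```python
def inertial_exponent_pairs(p):
    """Pairs (n1, n2) modulo p - 1 such that each n_j is 0 or 1
    (the flatness condition at level 1) and n1 + n2 = 1 mod p - 1
    (the determinant condition)."""
    if p < 3:
        raise ValueError("this check is only meant for odd primes p")
    modulus = p - 1
    flat = (0, 1)
    return sorted(
        (n1, n2)
        for n1 in flat
        for n2 in flat
        if (n1 + n2) % modulus == 1
    )


# For every odd prime tried, exactly one exponent is 0 and the other is 1.
for p in (3, 5, 7, 11):
    assert inertial_exponent_pairs(p) == [(0, 1), (1, 0)]
```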

Consider the connected-étale sequence (see [34, section 3.7]) of the finite
flat $\mathbf{Z}_p$-group scheme $G$ with generic fiber representation $\overline{\rho}$,

$$0 \to G^0 \to G \to G^{et} \to 0.$$


Recall the following: 


Unramifiedness criterion (U): For $p \neq 2$, a flat $D_p$-representation is
unramified if and only if the corresponding $\mathbf{Z}_p$-group scheme is étale over
$\mathbf{Z}_p$.


To prove this, we can make a base change to $\mathbf{Z}_p^{un}$, and then use the argument
which proved the irreducibility of $\overline{\rho}_{E,p}$ in the supersingular case in Theorem
1.1 (this amounts to the fact that for odd $p$, constant finite group schemes
over $\mathbf{Q}_p^{un}$ admit only constant extensions to finite flat group schemes over
$\mathbf{Z}_p^{un}$). If $p = 2$ this critical fact is not true (consider $\mu_2$ over
$\mathbf{Z}_2^{un}$).

Combining (U), Theorem 1.6, and the universal properties characterizing
$G^0$ and $G^{\text{ét}}$ in the category of finite flat $\mathbf{Z}_p$-group schemes, it follows
that the generic fiber representation of $G^{\text{ét}}$ is precisely the maximal
unramified quotient of $\bar\rho$ as an $\mathbf{F}_p[D_p]$-module. However, by functoriality of
the connected-étale sequence, we see that the canonical $k$-action on $G$ gives
rise to compatible canonical $k$-actions on $G^0$ and $G^{\text{ét}}$. In other words, the
generic fiber representation of $G^{\text{ét}}$ is also the maximal unramified quotient
of $\bar\rho$ as a $k[D_p]$-module. Thus, if $\bar\rho$ is not split as a $k[D_p]$-module, then
$G^{\text{ét}}$ is trivial and so $G$ is a connected group scheme. But all
subrepresentations of $\bar\rho$ arise as generic fiber representations of finite flat closed subgroup
schemes of $G$, and these are all necessarily connected, too. Hence, such
representations must be ramified, by (U). This contradicts the existence of the
unramified subrepresentation $\eta_1$.

Now consider the case in which $\bar\rho$ is irreducible. Since $\det\bar\rho|_{I_p} = \omega|_{I_p}$
and $p$ is odd, it follows that $\bar\rho$ is absolutely irreducible (the argument is
a local analogue of the global theorem that for $p \ne 2$, an odd continuous
irreducible two-dimensional representation of $G_{\mathbf{Q}}$ over $k$ is absolutely
irreducible). The normality of $I_p$ in $D_p$ then implies that $\bar\rho|_{I_p}$ is semisimple
and therefore tame [29, Ch IX, Lemma 2]. Thus, $\bar\rho|_{I_p}$ is abelian, so


\[ \bar\rho|_{I_p} \otimes_k \overline{\mathbf{F}}_p \simeq \chi_1 \oplus \chi_2 \]


with continuous characters $\chi_i : I_p \to \overline{\mathbf{F}}_p^\times$. These eigencharacters must take
their values in the quadratic extension $k_2$ of $k$ and $\chi_1 \ne \chi_2$ since $\bar\rho$ is
absolutely irreducible (and therefore non-abelian). The absolute irreducibility
of $\bar\rho$ then forces a Frobenius element of $D_p$ to interchange the two inertial
lines, so from the well-known explicit description of the conjugation action
of Frobenius on tame inertia, we obtain


\[ \chi_1^p = \chi_2, \qquad \chi_2^p = \chi_1. \]


Thus, $\chi_1 : I_p \to \mathbf{F}_{p^2}^\times$ and $\chi_1(I_p)$ does not lie in $\mathbf{F}_p^\times$.
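
The passage from the Frobenius relations to the statement about $\mathbf{F}_{p^2}^\times$ can be spelled out as follows. This is a sketch of the routine verification, assuming the relations take the form $\chi_1^p = \chi_2$ and $\chi_2^p = \chi_1$ and using $\chi_1 \ne \chi_2$.

```latex
\begin{align*}
\chi_1^{p^2} &= (\chi_1^{p})^{p} = \chi_2^{p} = \chi_1
  &&\Longrightarrow\quad \chi_1^{p^2-1} = 1, \\
\chi_1(I_p) &\subseteq \{x \in \overline{\mathbf{F}}_p^\times : x^{p^2-1} = 1\}
  = \mathbf{F}_{p^2}^\times, \\
\chi_1(I_p) \subseteq \mathbf{F}_p^\times
  &\Longrightarrow\quad \chi_2 = \chi_1^{p} = \chi_1,
  \ \text{contradicting } \chi_1 \neq \chi_2 .
\end{align*}
```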

The flatness of $\bar\rho$ implies (by Theorem 1.6) that Theorem 1.7 can be
applied to $\chi_1$, so $\chi_1$ is equal to either $\psi_2$ or $\psi_2^p$. This proves that the
semisimple $k[I_p]$-modules $\bar\rho|_{I_p}$ and $\psi_2$ are isomorphic over $\bar{k} = \overline{\mathbf{F}}_p$. By the
Brauer-Nesbitt Theorem, $\bar\rho|_{I_p} \simeq \psi_2$.
(ii) Our argument will follow the method suggested by the brief sketch
given in the proof of [8, Lemma 2.19(b)]. Note that we can immediately
reduce to the case in which $R$ is artinian (and therefore finite), so $\rho$ arises
as the generic fiber of a finite flat $\mathbf{Z}_p$-group scheme $G$. By Theorem 1.6
($p \ne 2$), $R$ acts on $G$. Let


\[ 0 \to G^0 \to G \to G^{\text{ét}} \to 0 \]


denote the connected-étale sequence of $G$. By the universal properties of
the connected-étale sequence, it follows that $R$ acts on $G^0$ and $G^{\text{ét}}$, so the
generic fibers give rise to a short exact sequence of $R[D_p]$-modules, with
$\rho$ in the middle. We will show that both $G^0$ and $G^{\text{ét}}$ have generic fiber
$R$-modules which are free of rank 1 and that the characters on these ‘lines’
are of the desired type.

Now comes the critical step where we exploit the theory of finite flat
group schemes. We claim that all $\mathbf{Z}_p[D_p]$-module Jordan-Hölder factors
of the generic fiber representation of $G^0$ are ramified. The reason is quite
simple. All such representations are flat and the corresponding finite flat
$\mathbf{Z}_p$-group scheme fits into a ‘decomposition series’ for the connected group
scheme $G^0$. All closed subgroup schemes and quotients of a connected
object in the category of finite flat $\mathbf{Z}_p$-group schemes are again
connected (as connectedness can be determined on the closed fiber and
base change commutes with formation of short exact sequences of finite
flat group schemes). A non-trivial connected finite flat $\mathbf{Z}_p$-group scheme is
not étale and its generic fiber representation must therefore be ramified (by
the criterion (U) above). This proves the claim concerning Jordan-Hölder
factors of the generic fiber representation of $G^0$.

Let $V_\rho$ be the representation space underlying $\rho$ and let $V_\rho^{\text{ét}}$ denote the
(non-zero) maximal unramified abelian group quotient. This is the generic
fiber representation of $G^{\text{ét}}$ and so it has a canonical $R$-module structure.
Due to the form of $\bar\rho$, $V_{\bar\rho} = V_\rho/\mathfrak{m}_R \to V_\rho^{\text{ét}}/\mathfrak{m}_R$ is surjective with $V_\rho^{\text{ét}}/\mathfrak{m}_R$
non-zero, so $V_\rho^{\text{ét}}/\mathfrak{m}_R$ is exactly 1-dimensional over $k$. Hence, $V_\rho^{\text{ét}}$ has a
single generator as an $R$-module. In order to show that it is a free
$R$-module of rank 1, it suffices to check that it has the same $R$-length as $R$.
We will check this below. On the other hand, the kernel $V_\rho^0$ of $V_\rho \to V_\rho^{\text{ét}}$
is the generic fiber representation of $G^0$ and consequently all of its
Jordan-Hölder factors (either as an $R[D_p]$-module or even as a $\mathbf{Z}_p[D_p]$-module) are
ramified. Since formation of the connected-étale sequence is compatible
with the local base change $\mathbf{Z}_p \to \mathbf{Z}_p^{\mathrm{un}}$, similar reasoning proves the same
‘ramified’ Jordan-Hölder property for $V_\rho^0$ when viewed as a $\mathbf{Z}_p[I_p]$- or
$R[I_p]$-module.

Since the $R$-action and $D_p$-action on $V_\rho$ commute, the form of $\bar\rho$ shows
that all $R[I_p]$-module Jordan-Hölder factors of the rank 2 free $R$-module
$V_\rho$ are 1-dimensional $k$-vector spaces of the form $k(1)$ or $k$. Note that since
$p \ne 2$, $k$ and $k(1)$ are not isomorphic as $D_p$-modules. Also, clearly all
$R[I_p]$-module Jordan-Hölder factors of $V_\rho^{\text{ét}}$ have the form $k$, and all such factors
of $V_\rho^0$ have the form $k(1)$. Because $V_\rho$ is free of rank 2 as an $R$-module, and
the quotient $V_\rho^{\text{ét}}$ has a single $R$-module generator, we see that the number
of unramified Jordan-Hölder factors is less than or equal to the number of
ramified ones.

We claim that it is enough to show that $V_\rho^0/\mathfrak{m}_R$ is 1-dimensional over $k$.
Indeed, this implies that $V_\rho^0$ and $V_\rho^{\text{ét}}$ are both quotients of $R$ as $R$-modules,
so since $V_\rho$ is free of rank 2 over $R$, comparing lengths proves that $V_\rho^{\text{ét}}$ and
$V_\rho^0$ are free of rank 1 over $R$. Set $\rho' = \operatorname{Hom}_R(\rho, R(1))$. Injecting $R(1)$
into a finite product of copies of $(\mathbf{Q}_p/\mathbf{Z}_p)(1)$ gives an injection of $\rho'$ into
a finite product of copies of the flat Cartier dual $\operatorname{Hom}_{\mathbf{Z}_p}(\rho, (\mathbf{Q}_p/\mathbf{Z}_p)(1))$,
so $\rho'$ is a flat representation (by Theorem 1.6). A consideration of
Jordan-Hölder factors then shows that $\rho' \bmod \mathfrak{m}_R$ is reducible with cyclotomic
inertial determinant (so (i) may be applied) and we readily conclude that
$V_\rho^0 = \operatorname{Hom}_R(V_{\rho'}^{\text{ét}}, R(1))$. Thus, the inertial actions on $V_\rho^{\text{ét}}$ and $V_\rho^0$ are
respectively trivial and cyclotomic, as desired. The uniqueness of the $\chi_i$
and the equality $\bar\chi_i = \chi_i \bmod \mathfrak{m}_R$ are clear.

We now verify that the right exact sequence


\[ 0 \to V_\rho^0/\mathfrak{m}_R \to V_\rho/\mathfrak{m}_R \to V_\rho^{\text{ét}}/\mathfrak{m}_R \to 0 \]


is exact, from which the desired 1-dimensionality of $V_\rho^0/\mathfrak{m}_R$ follows. Choose
a finite set of generators $X = \{x_i\}$ for $\mathfrak{m}_R$ and consider the obvious
$R[D_p]$-linear map


\[ \bigoplus_{x \in X} \rho \to \rho \]


with cokernel $\rho \bmod \mathfrak{m}_R$. If we extend this to a diagram arising from the
connected-étale sequences associated to each side, we can apply the snake
lemma in the abelian category of flat $D_p$-representations. The resulting
coboundary map must be 0 because any map between a connected and an
étale finite flat group scheme over a local base ring must be trivial. This
implies that $V_\rho^0/\mathfrak{m}_R \to V_\rho/\mathfrak{m}_R$ is injective.

(iii) The proof we give is taken from [7, Thm 13.1(7)], which considers a
slightly more general setting. The main point to note is that the proof uses
Ramakrishna’s Theorem 3.5 below, so when reading the proof of Theorem
3.5, note that the present result is never used, and thus there is no circular
reasoning. The idea of the proof is to use an observation of Faltings [38,
pp. 457-8] and an auxiliary trick to reduce to the special case where $\mathcal{O} = \mathbf{Z}_p$.
In that case we can use Raynaud’s fundamental work on determinants of
p-divisible groups [28, Thm 4.2.1].

Using $\mathcal{O} = W(k)$ in Theorem 3.5 below, we know that a universal ‘flat’
$W(k)$-deformation ring $R_{\bar\rho}^{\mathrm{fl}}$ exists and that it is isomorphic to $W(k)[[T_1, T_2]]$.
If we could show that the associated universal ‘flat’ representation has
inertial determinant $\varepsilon|_{I_p}$, then we’d be done. The observation of Faltings
mentioned above ensures that we can always extend the field $k$ without
loss of generality. Thus, assume $k$ is large enough so that there exists a
continuous unramified character $\chi : D_p \to k^\times$ such that $\omega^{-1}\det\bar\rho = \chi^2$
(this is possible since $D_p/I_p \simeq \hat{\mathbf{Z}}$ has no non-trivial 2-torsion). Recall
that flatness of a $D_p$-representation is unaffected by unramified twisting
(due to Galois descent, as mentioned after Theorem 1.6). Thus, twisting
by the unramified Teichmüller lift of $\chi^{-1}$ gives (via Yoneda’s Lemma) an
isomorphism between the universal flat deformation rings of $\bar\rho$ and $\bar\rho \otimes \chi^{-1}$.
This reduces us to the case in which $\det\bar\rho = \omega$.
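
The existence of the twisting character and its effect on the determinant can be made explicit. The following is a sketch, assuming only that $\omega^{-1}\det\bar\rho$ is unramified (which follows from $\det\bar\rho|_{I_p} = \omega|_{I_p}$); the symbols $\mathrm{Frob}$ (a topological generator of $D_p/I_p$), $a$ and $\alpha$ are notation introduced here for illustration.

```latex
\begin{align*}
a &:= (\omega^{-1}\det\bar\rho)(\mathrm{Frob}) \in k^\times,
  \qquad \text{choose } \alpha \in \overline{k}^\times \text{ with } \alpha^2 = a, \\
\chi &: D_p \to D_p/I_p \to k(\alpha)^\times, \qquad \chi(\mathrm{Frob}) = \alpha
  \quad\Longrightarrow\quad \chi^2 = \omega^{-1}\det\bar\rho, \\
\det(\bar\rho \otimes \chi^{-1}) &= \chi^{-2}\det\bar\rho = \omega .
\end{align*}
```

Replacing $k$ by the (at most quadratic) extension $k(\alpha)$ is what "large enough" amounts to in this sketch.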

It is relatively straightforward to check that up to isomorphism, there
is only one continuous representation $D_p \to \mathrm{GL}_2(k)$ which has
determinant $\omega$ and which is isomorphic to $\psi_2$ on $I_p$ (and in particular, this
$D_p$-representation is self-dual). In addition, this unique representation is
defined over $\mathbf{F}_p$. Now applying (i) and Faltings’ observation ‘in reverse,’ we
reduce to the case $k = \mathbf{F}_p$, so $R_{\bar\rho}^{\mathrm{fl}} \simeq \mathbf{Z}_p[[T_1, T_2]]$. Because this universal
ring has such a special form, we see that to prove that the universal flat
deformation has inertial determinant $\varepsilon|_{I_p}$, it is enough to check this on all
$\mathbf{Z}_p$-valued points. But a $\mathbf{Z}_p$-valued point is the same thing as a p-divisible
group over $\mathbf{Z}_p$ with p-torsion representation $\bar\rho$.

We are therefore reduced to checking that if $\rho : D_p \to \mathrm{GL}_2(\mathbf{Z}_p)$ is the
generic fiber representation of a p-divisible group $\Gamma$ over $\mathbf{Z}_p$ with $\rho \bmod p \simeq
\bar\rho$, then $\det\rho|_{I_p} = \varepsilon|_{I_p}$. By [28, Thm 4.2.1], it is enough to check that $\Gamma$ has
dimension 1. Since $\bar\rho$ is ramified, clearly $\Gamma$ is not étale and so $\dim\Gamma \ge 1$.
The dual $\Gamma^*$ has generic fiber p-torsion representation $\bar\rho^* \simeq \bar\rho$, so $\dim\Gamma^* \ge
1$. By [33, §2.3, Prop 3],


\[ \dim\Gamma + \dim\Gamma^* = \operatorname{height}(\Gamma) = 2, \]


so $\dim\Gamma = 1$, as desired. ∎
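
The final count can be written as one chain of inequalities. This only restates the argument above, using $\dim\Gamma \ge 1$, $\dim\Gamma^* \ge 1$ and the height formula.

```latex
\[
1 \le \dim\Gamma = \operatorname{height}(\Gamma) - \dim\Gamma^{*}
  = 2 - \dim\Gamma^{*} \le 2 - 1 = 1 .
\]
```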

Since Ramakrishna’s Theorem 3.5 is true in the residually ramified and
irreducible case when $p = 2$ (though we omit the proof of this), Theorem
1.8(iii) is true when $p = 2$. This is not needed in Wiles’ proof.

§2. DEFINING THE FUNCTOR

For the remainder of this article, we will fix the following notation. Let $\mathcal{O}$
denote the valuation ring of a finite extension of $\mathbf{Q}_p$, with $\lambda$ a uniformizer.
Let $k = \mathcal{O}/\lambda$ denote the finite residue field of characteristic $p$. Choose a
continuous representation


\[ \bar\rho : D_p \to \mathrm{GL}_N(k) \]


such that $\bar\rho$ is flat. Later on we will specialize to the case $N = 2$, $p \ne 2$,
and $\det\bar\rho|_{I_p} = \omega|_{I_p}$, but for now this is not necessary. Also, we remind
the reader that even though we are ultimately interested in the case $\bar\rho =
\bar\rho_{E,p}|_{D_p}$, the $D_p$-representation arising from the p-torsion of an elliptic curve
$E$ over $\mathbf{Q}$, for technical reasons in Wiles’ method it is necessary to extend
the field of scalars from $\mathbf{F}_p$ to a finite extension $k$ which contains all of the
eigenvalues of the finite subgroup $\bar\rho_{E,p}(G_{\mathbf{Q}}) \subseteq \mathrm{GL}_2(\mathbf{F}_p)$. This is one reason
why it is critical that we work in the level of generality fixed above.

Let $D_{\bar\rho}^{\mathrm{univ}}$ denote the universal deformation functor attached to $\bar\rho$, on
the category $\mathcal{C}_{\mathcal{O}}$ of complete local noetherian $\mathcal{O}$-algebras with residue field
$k$. If $\bar\rho$ is absolutely irreducible (e.g., if $\bar\rho = \bar\rho_{E,p}$ for $E_{/\mathbf{Q}_p}$ an elliptic curve
with supersingular reduction), then $D_{\bar\rho}^{\mathrm{univ}}$ is representable [23, §20]. We let
$R_{\bar\rho}^{\mathrm{univ}}$ denote the representing ring. Recall, as was mentioned earlier, that
our methods below will be needed to handle certain types of ‘ord’ cases
(e.g., those arising from Example 1.3), so it would be too restrictive for us
to assume at the outset that $\bar\rho$ is (absolutely) irreducible. Nevertheless, the
absolutely irreducible case is good to keep in mind, since in this case the
universal deformation functor is known to be representable.

The essential problem which we need to solve for input into Wiles’ proof
is to compute the orders of $D_p$-cohomology groups attached to certain
deformations of $\bar\rho$. In terms of deformation functors, this will amount to
computing the size of certain ‘distinguished’ subsets of $D_{\bar\rho}^{\mathrm{univ}}(A)$ for suitable
artinian objects $A$ in $\mathcal{C}_{\mathcal{O}}$ (e.g., $A = \mathcal{O}/\lambda^n$). If a representing ring $R_{\bar\rho}^{\mathrm{univ}}$
exists (e.g., if $\bar\rho$ is absolutely irreducible), then


\[ D_{\bar\rho}^{\mathrm{univ}}(A) = \operatorname{Hom}_{\mathcal{C}_{\mathcal{O}}}(R_{\bar\rho}^{\mathrm{univ}}, A), \]


so if we could determine the structure of $R_{\bar\rho}^{\mathrm{univ}}$ then we would be on the
right track. This is the basic reason for interest in deformation rings. In
principle, it does not matter what the deformation ring looks like, so long
as we can somehow compute the size of certain ‘distinguished’ subsets of
$D_{\bar\rho}^{\mathrm{univ}}(A)$ (and these subsets will make sense even when the universal
deformation ring is not known to exist). However, in many cases (e.g.,
supersingular cases) the computations we will need to make are consequences of
a much more precise structure theorem for a ‘restricted’ universal
deformation ring, and so we will aim to prove this structure theorem and then will
derive its consequences for the ‘restricted’ deformation problem of interest.
This also seems to be a more natural way to proceed, since the existence
of a representing ring begs the question of figuring out what it is. So let us
start by first defining the ‘flat deformation functor.’


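Definition 2.1. For $A$ in $\mathcal{C}_{\mathcal{O}}$, define $D_{\bar\rho}^{\mathrm{fl}}(A)$ to be the subset in $D_{\bar\rho}^{\mathrm{univ}}(A)$
consisting of those deformations of $\bar\rho$ to $\mathrm{GL}_N(A)$ whose representative
liftings

\[ \rho : D_p \to \mathrm{GL}_N(A) \]


have the property that for all $n \ge 1$, the finite $D_p$-module $\rho \bmod \mathfrak{m}_A^n$ is
flat.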


A few comments are in order concerning this definition. First of all,
note that the condition on the lifting $\rho$ can be checked on any single
representative in the same deformation class of $\rho$, as flatness is a property of
the isomorphism class of a finite discrete $D_p$-module. Also, by Theorem
1.6, we see that it suffices to check the ‘flatness’ constraint on $\rho \bmod \mathfrak{m}_A^n$
just for sufficiently large $n$. If $A$ is artinian, this is the same as saying that
the finite $D_p$-module $\rho$ is flat, in the sense of Definition 1.5. Of course,
for any continuous representation $\rho : D_p \to \mathrm{GL}_N(A)$, $\rho \bmod \mathfrak{m}_A^n$ is a finite
$D_p$-module with p-power order because $A$ is a noetherian local ring with a
finite residue field of characteristic $p$. Lastly, in [38, p. 457], the definition
analogous to Definition 2.1 is given in terms of a condition on $\rho \bmod \mathfrak{a}$ for
all open ideals $\mathfrak{a}$ in $A$, not just the ideals $\mathfrak{m}_A^n$ for all $n \ge 1$. However, since
the ideals $\mathfrak{m}_A^n$ give a base of opens around 0 in $A$, it follows that every
$\rho \bmod \mathfrak{a}$ is a quotient of some $\rho \bmod \mathfrak{m}_A^n$, so by Theorem 1.6 it follows that
the above definition is equivalent to that given in [38].


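Example 2.2. Let $E_{/\mathbf{Q}_p}$ have good reduction and $\bar\rho = \bar\rho_{E,p}$. Let $\mathcal{O} = \mathbf{Z}_p$
and $A = \mathcal{O}$. Then $\rho = \rho_{E,p}$ represents an element in $D_{\bar\rho}^{\mathrm{fl}}(A)$. This uses
Theorem 1.2 and the fact that $\rho \bmod \mathfrak{m}_A^n$ is nothing other than $E[p^n](\overline{\mathbf{Q}}_p)$.
This is the primordial example to keep in mind and is the main reason why
we care about $D_{\bar\rho}^{\mathrm{fl}}$ here.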


Though Example 2.2 shows that the sets $D_{\bar\rho}^{\mathrm{fl}}(\mathcal{O}/\lambda^n)$ are the main sets in
which we are interested, we also see that at this point we have done nothing
of mathematical substance with these sets other than define them. It might
even appear hopeless to say anything about such abstractly defined sets.
We need a technique for understanding and constructing flat deformations.
There is a theory due to Fontaine which will enable us to actually construct
(albeit in a somewhat indirect manner) many flat deformations of $\bar\rho$. When
combined with the fact that $D_{\bar\rho}^{\mathrm{fl}}$ is a representable functor in certain cases
(something we will prove shortly), we will be able to get enough of a hold
on the representing ring that we can say exactly what it is! This will then
make it essentially an easy exercise to answer the questions we will have
about flat deformations (when $D_{\bar\rho}^{\mathrm{fl}}$ is a representable functor).

To begin, we need to justify a couple of facts about $D_{\bar\rho}^{\mathrm{fl}}$. For example,
at this point it is not even clear if this is a functor! To make this clearer,
consider the simpler (and essential) case of a map $\varphi : A \to B$ in $\mathcal{C}_{\mathcal{O}}$ with
$A$ and $B$ both artinian. We then have a natural map of sets


\[ D_{\bar\rho}^{\mathrm{univ}}(\varphi) : D_{\bar\rho}^{\mathrm{univ}}(A) \to D_{\bar\rho}^{\mathrm{univ}}(B), \]


which, in concrete terms, just takes a representation into $\mathrm{GL}_N(A)$ and
applies $\varphi$ to the matrix entries to give a representation into $\mathrm{GL}_N(B)$. What
we would most like is $D_{\bar\rho}^{\mathrm{univ}}(\varphi)$ to map the subset $D_{\bar\rho}^{\mathrm{fl}}(A)$ into $D_{\bar\rho}^{\mathrm{fl}}(B)$. In
other words, given a flat representation into $\mathrm{GL}_N(A)$, if we map the matrix
entries into $B$, then we want the resulting representation into $\mathrm{GL}_N(B)$ to
be flat. This is not obvious (until one sees the proof)! That this is the case
is part of the next result.


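Theorem 2.3. (Ramakrishna) The association $A \rightsquigarrow D_{\bar\rho}^{\mathrm{fl}}(A)$ from $\mathcal{C}_{\mathcal{O}}$ to
Set is a subfunctor of $D_{\bar\rho}^{\mathrm{univ}}$. That is, given any morphism $\varphi : A \to B$ in
$\mathcal{C}_{\mathcal{O}}$,

\[ D_{\bar\rho}^{\mathrm{univ}}(\varphi)(D_{\bar\rho}^{\mathrm{fl}}(A)) \subseteq D_{\bar\rho}^{\mathrm{fl}}(B). \]

If $\bar\rho$ has trivial centralizer (e.g., if it is absolutely irreducible), then $D_{\bar\rho}^{\mathrm{fl}}$
is representable, by an object $R_{\bar\rho}^{\mathrm{fl}}$ in $\mathcal{C}_{\mathcal{O}}$. The resulting natural map


\[ R_{\bar\rho}^{\mathrm{univ}} \to R_{\bar\rho}^{\mathrm{fl}} \]


is surjective.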
PROOF. Since $\bar\rho$ is assumed to be flat, we see that $D_{\bar\rho}^{\mathrm{fl}}(k)$ is a one element
set; that is,

\[ D_{\bar\rho}^{\mathrm{fl}}(k) \subseteq D_{\bar\rho}^{\mathrm{univ}}(k) = \{\bar\rho\} \]


is not empty. By Theorem 1.6, the property of a finite $D_p$-module being
flat is preserved under passage to subrepresentations, quotients, and (finite)
direct sums. It therefore follows from [23, Prop 1, §25; Cor, §23] that $D_{\bar\rho}^{\mathrm{fl}}$ is
a subfunctor of $D_{\bar\rho}^{\mathrm{univ}}$ as claimed and that when $\bar\rho$ is absolutely irreducible,
$D_{\bar\rho}^{\mathrm{fl}}$ is represented by a quotient ring of $R_{\bar\rho}^{\mathrm{univ}}$.

If $\bar\rho$ has trivial centralizer (but is not assumed to be absolutely
irreducible), and if we can show that all lifts of $\bar\rho$ have a trivial centralizer,
then the method of proof in [23] can still be applied (i.e., the only property
of residual absolute irreducibility which is needed is the fact that lifts have
trivial centralizer).

So we are left with the problem of proving that if


\[ \bar\rho : G \to \mathrm{GL}_N(k) \]


is a representation of any group $G$ (with $k$ any field) and $\bar\rho$ has trivial
centralizer, then for any complete local noetherian ring $A$ with residue
field $k$ and any lift


\[ \rho : G \to \mathrm{GL}_N(A) \]


of $\bar\rho$, $\rho$ has trivial centralizer. Passing to the limit, we are reduced to the
case in which $A$ is a local artin ring and we induct on the length of $A$ as
an $A$-module, the case of length 1 being our original hypothesis on $\bar\rho$.

Choose non-zero $z \in \mathfrak{m}_A$, so by induction $\rho \bmod z$ has trivial
centralizer. Let $c \in M_N(A)$ commute with the action of $\rho$, so $c \equiv a \bmod zM_N(A)$
for some $a \in A$. Replacing $c$ by $c - a$, we can assume $c = zc'$ for some
$c' \in M_N(A)$. Since $c$ centralizes $\rho$, we see that $c'$ centralizes $\rho \bmod \operatorname{ann}(z)$.
By our inductive length assumption and the fact that $\operatorname{ann}(z)$ is a non-zero
proper ideal of $A$ (it contains the socle of $A$, and $z \ne 0$), $c'$ is
congruent to a scalar matrix modulo the annihilator of $z$, so we conclude
as desired that $c = zc'$ is a scalar matrix. ∎
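
The conclusion of this induction can be checked by brute force in a toy case. The sketch below is only an illustration under stated assumptions: the group is the one generated by two unipotent matrices (not $D_p$), the coefficient ring is $\mathbf{Z}/9$ with residue field $\mathbf{F}_3$, and the helper names `matmul`, `centralizer`, and `is_scalar` are hypothetical names introduced here.

```python
from itertools import product

def matmul(x, y, m):
    """Multiply two 2x2 matrices (tuples of rows) with entries taken modulo m."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) % m for j in range(2))
        for i in range(2)
    )

def centralizer(gens, m):
    """List every c in M_2(Z/m) that commutes with each generator in gens."""
    found = []
    for a, b, c, d in product(range(m), repeat=4):
        mat = ((a, b), (c, d))
        if all(matmul(mat, g, m) == matmul(g, mat, m) for g in gens):
            found.append(mat)
    return found

def is_scalar(mat):
    """True when mat is a scalar multiple of the identity."""
    return mat[0][1] == 0 and mat[1][0] == 0 and mat[0][0] == mat[1][1]

# Toy generators (an assumption for illustration): upper and lower unipotents.
GENS = (((1, 1), (0, 1)), ((1, 0), (1, 1)))

residual = centralizer(GENS, 3)  # centralizer of the residual representation over F_3
lifted = centralizer(GENS, 9)    # centralizer of the lift over the artinian ring Z/9

print(len(residual), len(lifted), all(is_scalar(c) for c in lifted))
```

In this toy case the residual centralizer consists of scalars only, and the exhaustive search over $M_2(\mathbf{Z}/9)$ finds that the lift has the same property, as the induction predicts.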

Note that the only input from the theory of finite flat group schemes in 
the above proof occurs when we invoke [23], which uses only the fact that 
the ‘flatness’ condition on a finite $D_p$-module has certain formal properties;
namely, it is preserved under passage to (finite) direct sums, subrepresen- 
tations, and quotients. This does not make any deep use of the theory of 
finite flat group schemes (and in particular, this part of Theorem 1.6 is not 
the hard part; far from it, in fact). 


§3. LOCAL GALOIS COHOMOLOGY AND DEFORMATION THEORY

We now review some results given in [23] and formulate them in a way that
will be convenient for our purposes. We then will state the main result
concerning the orders of local $H^1$’s. When $\bar\rho$ has trivial centralizer (so
$R_{\bar\rho}^{\mathrm{fl}}$ is known to exist!), we will give an interpretation of this main result
as a statement about the structure of $R_{\bar\rho}^{\mathrm{fl}}$. The proofs of these results
can be reduced to constructing ‘enough’ flat representations. The actual
construction of such representations will be carried out in §5, as it will first
require a review of Fontaine’s approach to the theory of finite flat group
schemes (to be discussed in §4).

Pick an artinian object $A$ in $\mathcal{C}_{\mathcal{O}}$ and a flat lifting $\rho$ of $\bar\rho$ to $\mathrm{GL}_N(A)$, so
the deformation given by $\rho$ is an element of $D_{\bar\rho}^{\mathrm{fl}}(A)$. There are two examples
to keep in mind, with $A = \mathcal{O}/\lambda^n$, $N = 2$, and $p \ne 2$. These are


\[ \rho = (\mathcal{O} \otimes_{\mathbf{Z}_p} \rho_{E,p}) \bmod \lambda^n \quad\text{and}\quad \rho = \rho_{f,\lambda} \bmod \lambda^n. \]


We have $\bar\rho = k \otimes_{\mathbf{F}_p} \bar\rho_{E,p}$ and $E_{/\mathbf{Q}_p}$ an elliptic curve with good reduction
in the first example and $\bar\rho = \rho_{f,\lambda} \bmod \lambda$ in the second, with $f$ a weight 2
newform having level prime to $p$ and $\mathcal{O}$ the completion of the integer ring
$\mathcal{O}_f \subseteq \mathbf{C}$ at a prime above $p$.

Consider $H^1(D_p, \operatorname{ad}(\rho))$ in our general setting. It is shown in [23, §29]
that as an $A$-module, this is naturally identified with $\operatorname{Ext}^1_{A[D_p]}(\rho, \rho)$. This
is an $A$-module that classifies isomorphism classes of exact sequences of
$A[D_p]$-modules

\[ 0 \to \rho \to X \to \rho \to 0 \]


in which $X$, necessarily an $A[D_p]$-module with finite $A$-length, has a
$D_p$-action which is trivial on an open subgroup of $D_p$. That is, the
$D_p$-action must be continuous with respect to the discrete topology on the
finite set underlying $X$. Inside of $\operatorname{Ext}^1_{A[D_p]}(\rho, \rho)$, there is a natural subset
$\operatorname{Ext}^1_{A[D_p],\mathrm{fl}}(\rho, \rho)$ consisting of those elements with a representative short
exact sequence in which $X$ is a flat representation (note that $X$ automatically
has p-power order). This condition, of course, then holds for any choice of
representative short exact sequence.

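From the formal properties of the ‘flat’ condition on representations
(Theorem 1.6), it follows that


\[ \operatorname{Ext}^1_{A[D_p],\mathrm{fl}}(\rho, \rho) \subseteq \operatorname{Ext}^1_{A[D_p]}(\rho, \rho) \]


is not just a subset, but is an $A$-submodule [23]. Via the $A$-module
isomorphism

\[ \operatorname{Ext}^1_{A[D_p]}(\rho, \rho) \simeq H^1(D_p, \operatorname{ad}(\rho)), \]


we can define a corresponding $A$-submodule

\[ H^1_{\mathrm{fl}}(D_p, \operatorname{ad}(\rho)) \subseteq H^1(D_p, \operatorname{ad}(\rho)). \]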


See [23, §25] for a more detailed explanation of all of this. Beware that,
despite what the notation may suggest, $H^1_{\mathrm{fl}}(D_p, \operatorname{ad}(\rho))$ is best thought of
as a functor of $\rho$, not of $\operatorname{ad}(\rho)$. We will give some ad hoc definitions of
$H^1_{\mathrm{fl}}(D_p, *)$ for other $*$’s, but there is not a general definition of an $H^1_{\mathrm{fl}}$ functor
in our situation. Note that $H^1_{\mathrm{fl}}(D_p, \operatorname{ad}(\rho))$ is exactly the $H^1$ term whose
size Wiles needs to tightly control in his ‘local-to-global’ Galois cohomology
estimate (for suitable $A$ and $\rho$).

Let’s give an easy (but important!) example of some elements in the
$A$-module $H^1_{\mathrm{fl}}(D_p, \operatorname{ad}(\rho))$.


Example 3.1. For an unramified continuous additive homomorphism

\[ \chi : D_p \to A, \]


we can consider $X_\chi$ which has the ‘block matrix form’

\[ \begin{pmatrix} \rho & \chi\rho \\ 0 & \rho \end{pmatrix}. \]

When restricted to $I_p$, this is just $\rho|_{I_p} \oplus \rho|_{I_p}$, so $X_\chi|_{I_p}$ is certainly the generic
fiber representation of a finite flat group scheme $G$ over $\mathbf{Z}_p^{\mathrm{un}}$; namely, if $\rho$
is the generic fiber representation of a finite flat $\mathbf{Z}_p$-group scheme $H$, then
we can take


\[ G = (H \times_{\mathbf{Z}_p} \mathbf{Z}_p^{\mathrm{un}}) \times_{\mathbf{Z}_p^{\mathrm{un}}} (H \times_{\mathbf{Z}_p} \mathbf{Z}_p^{\mathrm{un}}). \]


It follows from standard facts about Galois descent of rings that $G$
canonically descends to a finite flat $\mathbf{Z}_p$-group scheme whose generic fiber
representation is $X_\chi$. This is the same thing that we needed in the discussion
of descent following Theorem 1.6.


Though Example 3.1 gives an $A$-line of ‘flat’ elements in $H^1(D_p, \operatorname{ad}(\rho))$,
we can construct even more. However, we should first point out a subtlety
in the definition of $H^1_{\mathrm{fl}}(D_p, \operatorname{ad}(\rho))$ which is easy to overlook but which will
slightly complicate our life. Assume for now that $p$ does not divide $N$.
In the case of interest, $N = 2$ and $p$ is odd, so this is not a problematic
assumption. Since $N$ is then invertible in $A$, if we let $\operatorname{ad}^0(\rho) \subseteq \operatorname{ad}(\rho)$
denote the $A$-submodule consisting of elements with trace 0, then there is
a canonical isomorphism of $A[D_p]$-modules


\[ \operatorname{ad}^0(\rho) \oplus A \simeq \operatorname{ad}(\rho). \]


The $D_p$-action on the scalar line $A$ is of course trivial. We have a canonical
isomorphism of $A$-modules


\[ H^1(D_p, \operatorname{ad}^0(\rho)) \oplus H^1(D_p, A) \simeq H^1(D_p, \operatorname{ad}(\rho)). \]
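
The splitting of $\operatorname{ad}(\rho)$ can be written down explicitly. This is a sketch using only that $N$ is invertible in $A$; here $X$ denotes an arbitrary element of $\operatorname{ad}(\rho) = M_N(A)$ and $1_N$ the identity matrix, both notation introduced for this illustration.

```latex
\[
X = \Bigl(X - \tfrac{\operatorname{tr}(X)}{N}\,1_N\Bigr) + \tfrac{\operatorname{tr}(X)}{N}\,1_N,
\qquad
\operatorname{tr}\Bigl(X - \tfrac{\operatorname{tr}(X)}{N}\,1_N\Bigr)
  = \operatorname{tr}(X) - \tfrac{\operatorname{tr}(X)}{N}\cdot N = 0 .
\]
```

Conjugation by $\rho(g)$ preserves the trace and fixes scalar matrices, so both summands are stable under $D_p$, which is why the decomposition is one of $A[D_p]$-modules and passes to cohomology.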


Before connecting this up with $H^1_{\mathrm{fl}}$, we should mention that the reason for
interest in $\operatorname{ad}^0(\rho)$ is that under the identification of $H^1(D_p, \operatorname{ad}(\rho))$ with
the set of ‘infinitesimal’ deformations of $\rho$ to $\mathrm{GL}_N(A[\epsilon])$, $H^1(D_p, \operatorname{ad}^0(\rho))$
corresponds to those deformations whose determinant is the ‘same’ as $\det\rho$
(via the canonical inclusion $A^\times \hookrightarrow A[\epsilon]^\times$). This is explained in [23, §24]
and is of interest because of the fact that we are primarily interested in
studying deformations of $\bar\rho_{E,p}$ with a ‘fixed’ determinant, namely the
cyclotomic character. A cyclotomic determinant condition is but one of several
deformation-theoretic conditions satisfied by the Tate module deformation
$\rho_{E,p}$. The reason it is useful to impose such conditions on the
deformations we consider is that they ‘cut down’ on the size of the corresponding
deformation ring (if it exists), and so we can expect the deformation
ring to encode more and more refined information about the Tate module
representation $\rho_{E,p}$.
Now we return to a general setting. We define


\[ H^1_{\mathrm{fl}}(D_p, \operatorname{ad}^0(\rho)) = H^1(D_p, \operatorname{ad}^0(\rho)) \cap H^1_{\mathrm{fl}}(D_p, \operatorname{ad}(\rho)) \]

and

\[ H^1_{\mathrm{fl}}(D_p, A) = H^1(D_p, A) \cap H^1_{\mathrm{fl}}(D_p, \operatorname{ad}(\rho)), \]


so we have the $A$-linear inclusion

\[ \iota_\rho : H^1_{\mathrm{fl}}(D_p, \operatorname{ad}^0(\rho)) \oplus H^1_{\mathrm{fl}}(D_p, A) \hookrightarrow H^1_{\mathrm{fl}}(D_p, \operatorname{ad}(\rho)). \]


We might expect this map to be an isomorphism. But at the present
time it isn’t known how to prove this in general. Nevertheless, for the
type of $\rho$ which arise in Wiles’ proof, we can prove the result. It should
be emphasized that this is actually not needed; without this, the Main
Theorem 3.3 below would merely be an inequality of the form $\le$, which
is adequate for Wiles’ needs. But it seems like a good policy to prove
results in as strong a form as possible (who knows what will be useful in
the future?), so we give here a proof that $\iota_\rho$ is sometimes an isomorphism.


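Lemma 3.2. Let $A$ and $\rho$ be as above, with $N = 2$, $p \ne 2$, and $\det\bar\rho|_{I_p} =
\omega|_{I_p}$. Then $\iota_\rho$ is an isomorphism and


\[ \operatorname{Hom}_{\mathrm{cont}}(D_p/I_p, A) \to H^1_{\mathrm{fl}}(D_p, A) \]


is an isomorphism (i.e., $|H^1_{\mathrm{fl}}(D_p, A)| = |A|$).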


REMARK. The above result applies, in particular, if $A = \mathcal{O}/\lambda^n$. This is the
only case which arises in Wiles’ method. Also, the proof in the residually
irreducible case ultimately relies on Theorem 1.8(iii), whose proof requires
a vast amount of algebraic geometry. Thus, the reader can skip Lemma 3.2
in the residually irreducible case and insert inequalities where appropriate
in our later arguments (i.e., Theorem 3.3) and Wiles’ method will still go
through (and will yield equalities after the method succeeds).

PROOF. Consider the composite map


\[ H^1_{\mathrm{fl}}(D_p, \operatorname{ad}(\rho)) \hookrightarrow H^1(D_p, \operatorname{ad}(\rho)) \to H^1(D_p, A). \]


The lemma is readily seen to be equivalent to the assertion that the image
is exactly the $A$-submodule $\operatorname{Hom}_{\mathrm{cont}}(D_p/I_p, A)$ inside of $H^1_{\mathrm{fl}}(D_p, A)$. Note
that it is not a priori clear that the image even lies in $H^1_{\mathrm{fl}}(D_p, A)$. Since
$\operatorname{Hom}_{\mathrm{cont}}(D_p/I_p, A)$ is the kernel of the natural restriction map


\[ H^1(D_p, A) \to H^1(I_p, A), \]

the lemma is equivalent to the statement that the natural map

\[ H^1_{\mathrm{fl}}(D_p, \operatorname{ad}(\rho)) \to H^1(I_p, \operatorname{ad}(\rho)) \]


has image inside of $H^1(I_p, \operatorname{ad}^0(\rho))$.
Under the identification [23, §23]

\[ H^1_{\mathrm{fl}}(D_p, \operatorname{ad}(\rho)) \simeq D_\rho^{\mathrm{fl}}(A[\epsilon]), \]

our assertion is equivalent to the statement that if

\[ \rho_1 : D_p \to \mathrm{GL}_2(A[\epsilon]) \]


is a flat deformation of $\rho$, then $\det\rho_1|_{I_p} = \det\rho|_{I_p}$. Since both $\rho$ and $\rho_1$
are flat deformations of $\bar\rho$, both have the same inertial determinant, namely
$\varepsilon|_{I_p}$ (treat the cases in which $\bar\rho$ is reducible and irreducible separately, using
Theorem 1.8). ∎

There does not (at the present time) exist a satisfactory theory of any
sort of $H^1_{\mathrm{fl}}$ functor in our setting. For us, $H^1_{\mathrm{fl}}(D_p, \operatorname{ad}(\rho))$ is primarily an
artifice for singling out a special submodule of ‘distinguished’ cohomology
classes, and we need to find out how many of them there are in certain
special cases.

We now are in a position to state an important result that Wiles needs. 
This is the main result of the present article. 


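Main Theorem 3.3. Let $\bar\rho : D_p \to \mathrm{GL}_2(k)$ be flat with $\det\bar\rho|_{I_p} = \omega|_{I_p}$
and $p \ne 2$. Let

\[ \rho : D_p \to \mathrm{GL}_2(\mathcal{O}/\lambda^n) \]


be a flat lifting of $\bar\rho$. Then


\[ |H^1_{\mathrm{fl}}(D_p, \operatorname{ad}^0(\rho))| = |H^0(D_p, \operatorname{ad}^0(\rho))| \cdot |\mathcal{O}/\lambda^n|. \]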


REMARKS. Using Theorem 1.8(ii),(iii), the hypotheses imply that

\[ \det\rho|_{I_p} = \varepsilon|_{I_p}. \]


However, in the residually irreducible case the proof requires sophisticated
methods. The condition is automatic in the application to elliptic curves, so
the reader can insert this as an extra hypothesis in Theorem 3.3 and thereby
bypass Theorem 1.8(iii) without affecting the applicability of Theorem 3.3
to the study of semistable elliptic curves over $\mathbf{Q}$.

Since Lemma 3.2 in the residually irreducible case requires Theorem
1.8(iii), the reader who would prefer to avoid Theorem 1.8(iii) should
insert $\le$ in place of $=$ in Theorem 3.3 above. Such an inequality suffices
for the successful application of Wiles’ methods to elliptic curves.


Before addressing the proof of Theorem 3.3, we make a few observations
about what it says. The $H^0$ term is nothing other than the number of trace
0 matrices in $M_2(\mathcal{O}/\lambda^n)$ which commute with the action of $\rho$. For example,
if $\bar\rho$ is not split, then by Theorem 1.8(i) we see that $\bar\rho$ has trivial centralizer.
We saw in the proof of Theorem 2.3 that this implies that $\rho$ must also have
trivial centralizer. In particular, the only trace 0 matrix commuting with
$\rho$ is the zero matrix, so $|H^0(D_p, \operatorname{ad}^0(\rho))| = 1$ in these cases.
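
For contrast, the $H^0$ term can be computed by hand in a split situation. The following is a sketch under an assumption not made in the text: that $\rho$ itself is a direct sum $\chi_1 \oplus \chi_2$ of characters with values in $(\mathcal{O}/\lambda^n)^\times$, and that $\chi_1(g_0) - \chi_2(g_0)$ is a unit for some $g_0 \in D_p$. The entries $a, b, c', d$ are notation introduced here.

```latex
\[
c = \begin{pmatrix} a & b \\ c' & d \end{pmatrix}, \qquad
c\,\rho(g) = \rho(g)\,c \ \ \forall g
\iff b\,(\chi_2 - \chi_1)(g) = 0 = c'\,(\chi_1 - \chi_2)(g) \ \ \forall g
\iff b = c' = 0,
\]
\[
H^0(D_p, \operatorname{ad}^0(\rho))
  = \Bigl\{ \begin{pmatrix} a & 0 \\ 0 & -a \end{pmatrix} : a \in \mathcal{O}/\lambda^n \Bigr\},
\qquad
|H^0(D_p, \operatorname{ad}^0(\rho))| = |\mathcal{O}/\lambda^n| .
\]
```

The last equivalence in the first line uses the unit hypothesis at $g_0$; without it the centralizer can be larger.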

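Now we prove Theorem 3.3. By Lemma 3.2, the injection $\iota_\rho$ is an
isomorphism and


\[ |H^1_{\mathrm{fl}}(D_p, \mathcal{O}/\lambda^n)| = |\mathcal{O}/\lambda^n|, \]

so we have

\[ |H^1_{\mathrm{fl}}(D_p, \operatorname{ad}^0(\rho))| = |\mathcal{O}/\lambda^n|^{-1} \cdot |H^1_{\mathrm{fl}}(D_p, \operatorname{ad}(\rho))|. \]

Therefore, it suffices to prove


Lemma 3.4. Under the hypotheses in Theorem 3.3, we have

\[ |H^1_{\mathrm{fl}}(D_p, \operatorname{ad}(\rho))| = |H^0(D_p, \operatorname{ad}^0(\rho))| \cdot |\mathcal{O}/\lambda^n|^2. \]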
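
The reduction of Theorem 3.3 to Lemma 3.4 is the following arithmetic, written with the shorthand $q_n = |\mathcal{O}/\lambda^n|$ (introduced here). It assumes Lemma 3.2, that is, that $\iota_\rho$ is an isomorphism and $|H^1_{\mathrm{fl}}(D_p, \mathcal{O}/\lambda^n)| = q_n$.

```latex
\begin{align*}
|H^1_{\mathrm{fl}}(D_p, \operatorname{ad}(\rho))|
  &= |H^1_{\mathrm{fl}}(D_p, \operatorname{ad}^0(\rho))| \cdot |H^1_{\mathrm{fl}}(D_p, \mathcal{O}/\lambda^n)|
   = |H^1_{\mathrm{fl}}(D_p, \operatorname{ad}^0(\rho))| \cdot q_n, \\
|H^1_{\mathrm{fl}}(D_p, \operatorname{ad}(\rho))| = |H^0(D_p, \operatorname{ad}^0(\rho))| \cdot q_n^{2}
  &\iff |H^1_{\mathrm{fl}}(D_p, \operatorname{ad}^0(\rho))| = |H^0(D_p, \operatorname{ad}^0(\rho))| \cdot q_n .
\end{align*}
```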


Since Theorem 1.8(iii) is true for $p = 2$ and Theorem 4.5 below can be
extended to the case $p = 2$ with some connectedness hypotheses, Lemma
3.4 can be extended to the case $p = 2$ when $\bar\rho$ is irreducible. We omit the
arguments needed for this case, as it is not needed in Wiles’ proof.

The idea of the proof of Lemma 3.4 is quite simple: we will write down
every flat representation corresponding to an element in $H^1_{\mathrm{fl}}(D_p, \operatorname{ad}(\rho)) \simeq
D_\rho^{\mathrm{fl}}((\mathcal{O}/\lambda^n)[\epsilon])$ and just count how many we have! This is not the most
conceptually satisfying way to proceed, and in the case where $\bar\rho$ has trivial
centralizer, Fontaine has said that he can give a more ‘pure thought’ proof
(with a small amount of computation required). We will say more about
this below. We use a computational ‘brute force’ proof because it involves
a really clever idea, and also because it is needed (at present) to handle
the cases with non-trivial centralizer at the residual level. Regardless of
how one proves Lemma 3.4, Fontaine’s work on finite flat group schemes is
indispensable.

Before proving Lemma 3.4, we make some remarks in the case when
$p \ne 2$, $\bar\rho$ as above has trivial centralizer, and $\rho = \rho_0 \bmod \lambda^n$ for some


\[ \rho_0 : D_p \to \mathrm{GL}_2(\mathcal{O}) \]


lifting $\bar\rho$, with $\rho_0$ corresponding to an element in $D_{\bar\rho}^{\mathrm{fl}}(\mathcal{O})$. In practice, $\rho_0$ is
the local restriction of a modular lifting of $\bar\rho = \bar\rho_{E,p} \otimes_{\mathbf{F}_p} k$, but we will see
in Theorem 3.5 below that some $\rho_0$ as above always exists. The following
analysis serves as a good indication of what Lemma 3.4 really means.
Since $\bar\rho$ has a trivial centralizer, Lemma 3.4 amounts to the assertion


\[ |H^1_{\mathrm{fl}}(D_p, \operatorname{ad}(\rho))| = |\mathcal{O}/\lambda^n|^2. \]

By Theorem 2.3, the functor $D_{\bar\rho}^{\mathrm{fl}}$ is represented by some $R_{\bar\rho}^{\mathrm{fl}}$ in $\mathcal{C}_{\mathcal{O}}$. Let us
interpret Lemma 3.4 as an assertion about $R_{\bar\rho}^{\mathrm{fl}}$. In the case $n = 1$, Lemma
3.4 says that the $k$-vector space $H^1_{\mathrm{fl}}(D_p, \operatorname{ad}(\bar\rho))$ is 2-dimensional. Recalling
the fundamental isomorphism relating Galois cohomology and deformation
theory [23, §28]

\[ \operatorname{Hom}_k(\mathfrak{m}/(\mathfrak{m}^2, \lambda), k) \simeq H^1_{\mathrm{fl}}(D_p, \operatorname{ad}(\bar\rho)), \]


with $\mathfrak{m}$ the maximal ideal of $R_{\bar\rho}^{\mathrm{fl}}$, we see that choosing a $k$-basis of $\mathfrak{m}/(\mathfrak{m}^2, \lambda)$
gives rise to a surjection of rings


\[ \pi : \mathcal{O}[[T_1, T_2]] \to R_{\bar\rho}^{\mathrm{fl}}. \]


We now consider whether $\pi$ is an isomorphism.

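Let $\mathfrak{p}$ denote the kernel of the natural map $R_{\bar\rho}^{\mathrm{fl}} \to \mathcal{O}$ induced by $\rho_0$. By
suitable change of coordinates, we can assume $\mathfrak{p}$ is the image of $(T_1, T_2)$
under $\pi$, so $\mathfrak{p}$ has two generators. The method of proof of [38, Prop 1.2]
produces an $\mathcal{O}$-linear isomorphism


\[ H^1_{\mathrm{fl}}(D_p, \operatorname{ad}(\rho)) \simeq \operatorname{Hom}_{\mathcal{O}}(\mathfrak{p}/\mathfrak{p}^2, \mathcal{O}/\lambda^n) \]


(this construction uses the choice of the representation $\rho_0$ corresponding
to $\mathfrak{p}$). Hence, Lemma 3.4 says that the $\mathcal{O}/\lambda^n$-module $\mathfrak{p}/(\mathfrak{p}^2, \lambda^n\mathfrak{p})$, which
has two generators, is in fact free of rank 2. Passing to the limit (using
$\rho_0 \bmod \lambda^m$ with $m \ge n$), this says that $\mathfrak{p}/\mathfrak{p}^2$ is a free $\mathcal{O}$-module of rank
2. This fact is not at all obvious. Moreover, passing to the direct limit on
cohomology gives


\[ H^1_{\mathrm{fl}}(D_p, \operatorname{ad}(\rho_0) \otimes_{\mathcal{O}} (K/\mathcal{O})) \stackrel{\mathrm{def}}{=} \varinjlim_m H^1_{\mathrm{fl}}(D_p, \operatorname{ad}(\rho_0 \bmod \lambda^m)) \simeq (K/\mathcal{O}) \oplus (K/\mathcal{O}), \]


so this cohomology module is actually p-divisible as a group. This is all
very important in [38, Prop 1.9(v)]. What sort of structure must $R_{\bar\rho}^{\mathrm{fl}}$ have
if $\mathfrak{p}/\mathfrak{p}^2$ is free of rank 2? One possibility is that the surjection $\pi$ above is an
isomorphism. This is much stronger than the statement that $\mathfrak{p}/\mathfrak{p}^2 \simeq \mathcal{O} \oplus \mathcal{O}$
(e.g., a priori, we could have $R_{\bar\rho}^{\mathrm{fl}} \simeq \mathcal{O}[[T_1, T_2]]/(T_1, T_2)^2$, which has a unique
$\mathcal{O}$-valued point $\mathfrak{p}$ and $\mathfrak{p}/\mathfrak{p}^2 \simeq \mathcal{O} \oplus \mathcal{O}$). Nevertheless, if $\pi$ is an isomorphism,
this is a ‘good’ explanation for the $\mathcal{O}$-module freeness. This may seem like
a lot to ask for, but Ramakrishna came up with a very clever way to prove
that indeed, $\pi$ is an isomorphism.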


Theorem 3.5. (Ramakrishna) Assume $p \neq 2$. For a flat representation
$\bar\rho : D_p \to \mathrm{GL}_2(k)$ with trivial centralizer and with
$\det \bar\rho|_{I_p} = \omega|_{I_p}$, we have

\[ R^{fl}_{\bar\rho} \simeq \mathcal{O}[[T_1, T_2]]. \]

In particular, the natural map
$D^{fl}_{\bar\rho}(\mathcal{O}) \to D^{fl}_{\bar\rho}(\mathcal{O}/\lambda^n)$ is surjective for all
$n \geq 1$.

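Proof. The centralizer hypothesis ensures that $R^{fl}_{\bar\rho}$ exists. Let $\mathfrak{m}$ be the
maximal ideal of $R^{fl}_{\bar\rho}$. Using Fontaine’s theory from §4 (this requires $p \neq 2$),
we will write down in §5 all short exact sequences of flat $k[D_p]$-modules

\[ 0 \to \bar\rho \to E \to \bar\rho \to 0 \]

and count that there are $|k|^2$ of them (up to equivalence). Since we have
$k$-linear isomorphisms

\[ \mathrm{Hom}_k(\mathfrak{m}/(\mathfrak{m}^2, \lambda), k) \simeq H^1_{fl}(D_p, \mathrm{ad}(\bar\rho))
   \simeq \mathrm{Ext}^1_{k[D_p], fl}(\bar\rho, \bar\rho), \]

we obtain a surjection $\mathcal{O}[[T_1, T_2]] \twoheadrightarrow R^{fl}_{\bar\rho}$. Let $I$ denote the kernel.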

A natural approach to proving $I = 0$ is to try to use methods from
commutative algebra. But our knowledge of the commutative algebra
properties of $R^{fl}_{\bar\rho}$ is (right now) quite minimal. Instead, we will exploit the defining
property of $R^{fl}_{\bar\rho}$ in the following remarkable manner. Choose $f \in I$. If we
can show that the induced injective map on $\mathcal{O}/\lambda^n$-valued points

\[ \mathrm{Hom}_{\mathcal{C}_{\mathcal{O}}}(R^{fl}_{\bar\rho}, \mathcal{O}/\lambda^n)
   \hookrightarrow \mathrm{Hom}_{\mathcal{C}_{\mathcal{O}}}(\mathcal{O}[[T_1, T_2]], \mathcal{O}/\lambda^n) \]

is a bijection, then it follows that $f(t_1, t_2) \in \lambda^n\mathcal{O}$ for all
$t_1, t_2 \in \lambda\mathcal{O}$ and
all $n \geq 1$. Thus, $f$ vanishes on the open $\lambda$-adic unit disc and so by a basic
result from non-archimedean analysis we may conclude that $f = 0$!

To show that the injection on $\mathcal{O}/\lambda^n$-valued points is a bijection, all we
have to show is that both sides have the same size. Since the right side
trivially has size $|\lambda\mathcal{O}/\lambda^n\mathcal{O}|^2 = |k|^{2(n-1)}$, what we need to show is

\[ |\mathrm{Hom}_{\mathcal{C}_{\mathcal{O}}}(R^{fl}_{\bar\rho}, \mathcal{O}/\lambda^n)| = |k|^{2(n-1)}. \]

But $|\mathrm{Hom}_{\mathcal{C}_{\mathcal{O}}}(R^{fl}_{\bar\rho}, \mathcal{O}/\lambda^n)|$ actually means something: it is the number of flat
deformations of $\bar\rho$ to $\mathrm{GL}_2(\mathcal{O}/\lambda^n)$. We again use Fontaine’s theory ($p \neq 2$)
to simply write down all such possible deformations (this will be done in
§5) and count the number of possibilities; it turns out to be exactly what
we want. ∎
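
The count of points on the power series side can be checked by hand in the simplest case. The following is a minimal sketch under the assumption $\mathcal{O} = \mathbf{Z}_p$ (so $\lambda = p$, $k = \mathbf{F}_p$ and $\mathcal{O}/\lambda^n = \mathbf{Z}/p^n$); the function name `count_points` is ours and purely illustrative. A local homomorphism $\mathcal{O}[[T_1, T_2]] \to \mathbf{Z}/p^n$ is determined by the images of $T_1$ and $T_2$, which must lie in the maximal ideal $p\mathbf{Z}/p^n$.

```python
from itertools import product

def count_points(p: int, n: int) -> int:
    """Count pairs (t1, t2) in (pZ/p^n)^2, i.e. the Z/p^n-valued points
    of Z_p[[T1, T2]] (assumption: O = Z_p, so lambda = p)."""
    modulus = p ** n
    # Elements of the maximal ideal of Z/p^n: the multiples of p.
    maximal_ideal = [t for t in range(modulus) if t % p == 0]
    return sum(1 for _ in product(maximal_ideal, repeat=2))

for p, n in [(3, 1), (3, 2), (5, 3)]:
    # Compare with |k|^(2(n-1)) where |k| = p.
    print(p, n, count_points(p, n), p ** (2 * (n - 1)))
```

The two printed counts agree in each case, which is the ‘trivial’ half of the comparison; the substance of the proof is the matching count on the side of flat deformations.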

Note that the above proof shows that any surjection
$\mathcal{O}[[T_1, T_2]] \twoheadrightarrow R^{fl}_{\bar\rho}$
is an isomorphism (which is to be expected, since any surjection from a
noetherian ring to itself must be an isomorphism). One may ask if the
isomorphism in Theorem 3.5 can be chosen ‘naturally.’ That is, can one
‘interpret’ what the parameters $T_1$ and $T_2$ actually mean? In the proof of
Theorem 3.5, we chose arbitrarily a basis of the reduced Zariski cotangent
space $\mathfrak{m}/(\mathfrak{m}^2, \lambda)$ and then arbitrarily lifted these to elements of $\mathfrak{m}$. Is there
a natural way to make these choices? Recent work of Fontaine and Mazur
suggests that natural choices might exist. In the special case of Theorem
3.5, when $\mathcal{O} = \mathbf{Z}_p$, $p > 5$, and $\bar\rho$ is absolutely irreducible, they give an
explicit parameterization of
$\mathrm{Hom}_{\mathcal{C}_{\mathcal{O}}}(R^{fl}_{\bar\rho}, \mathcal{O})$ which assigns explicit meaning
to $T_1$ and $T_2$ [15, Thm B2(ii)]. One of the ingredients for this is an analogue
of the ideas to be described in the next section.

The fact that $R^{fl}_{\bar\rho}$ is a power series ring should have a cohomological
interpretation as a vanishing condition on an appropriately defined $H^2$
(compare with [22, §1.6, Prop 2] in the ‘unrestricted’ case). Fontaine has said
that he can prove such a vanishing condition directly, using [2, Lemma 4.4]
and a ‘cohomological dimension’ argument, thereby giving a conceptual
proof that $R^{fl}_{\bar\rho}$ is a power series ring. One then needs to do the calculation
that $\dim_k \mathrm{Ext}^1_{k[D_p], fl}(\bar\rho, \bar\rho) = 2$ as above in order to prove that the number
of variables is 2. This procedure has the advantage of working for $\mathrm{GL}_N$
and thereby highlights the role of $\mathrm{GL}_2$ as an artifact needed only for the
residual $\mathrm{Ext}^1$ calculation (which requires Theorem 1.8(i)). Also, one can
use the theory of Fontaine-Laffaille to bypass the residual $\mathrm{Ext}^1$ calculation
and to compute directly the $k$-dimension of $H^1_{fl}(D_p, \mathrm{ad}(\bar\rho))$. This produces
the number 2 via a completely different (but much more complicated)
calculation that explains more conceptually where the ‘2’ comes from. See [4,
pp. 6-11, esp. Thm 5] for further details.

It is fairly easy to use Theorem 3.5 to deduce Lemma 3.4 in the case
of a residually trivial centralizer. Indeed, Theorem 3.5 proves that for any
flat lift $\rho : D_p \to \mathrm{GL}_2(\mathcal{O}/\lambda^n)$ of $\bar\rho$, there exists
$\rho_0 : D_p \to \mathrm{GL}_2(\mathcal{O})$ lifting
$\rho$ with $\rho_0$ giving an element in $D^{fl}_{\bar\rho}(\mathcal{O})$. Hence, we can apply the discussion
preceding Theorem 3.5, and combining this with
$R^{fl}_{\bar\rho} \simeq \mathcal{O}[[T_1, T_2]]$, Lemma
3.4 follows if $\bar\rho$ has a trivial centralizer. In §5, we will give a direct proof
of Lemma 3.4 in all residually reducible cases, as well as complete the
unfinished steps in the proof of Theorem 3.5.

We conclude this section by showing how imposing extra deformation 
conditions cuts down quite a lot on the ‘size’ of the deformation ring. For 
example, suppose 

\[ \bar\rho : D_p \to \mathrm{GL}_2(k) \]

is flat and irreducible, with cyclotomic determinant on inertia. Also assume
$p \neq 2$. Ramakrishna showed using Tate local duality that

\[ R_{\bar\rho} \simeq \mathcal{O}[[X_1, X_2, X_3, X_4, X_5]], \]

where $5 = \dim_k \mathrm{ad}(\bar\rho) + h^0(D_p, \mathrm{ad}(\bar\rho))$ [27, Thm 4.1], [38, pp. 457-8]
(remark: the cohomological calculations used to prove [27, Thm 4.1] can be
simplified by the use of a $k$-linear version of Tate local duality rather than
just an $\mathbf{F}_p$-linear version). By Theorem 3.5, we see that imposing a flatness
condition on the deformations of $\bar\rho$ yields the quotient
$R^{fl}_{\bar\rho} \simeq \mathcal{O}[[T_1, T_2]]$ of
$R_{\bar\rho}$. Moreover, by [7, Thm 13.1(i)], it follows that if we impose the added
deformation condition that the determinant be some fixed $\mathcal{O}^\times$-valued
character $\chi$ lifting $\det \bar\rho$, with $\chi|_{I_p} = \varepsilon|_{I_p}$, then the resulting deformation ring
is isomorphic to a quotient of the form $\mathcal{O}[[T]]$. From an intuitive viewpoint,
if we required the $D_p$-representations to actually come from global
representations of a suitably restricted type, then this would cut down on the
deformation ring even more. Hence, one way to think about why the global
deformation rings considered by Wiles are so ‘small’ is that the functors
they represent involve a lot of (local) constraints!

There is one interesting property of all of the Galois deformation rings 
which have been computed, whether local or global: they are flat over $\mathcal{O}$.
Does this fact have any deep meaning? 


§4. FONTAINE’S APPROACH TO FINITE FLAT GROUP SCHEMES. 

The classical theory of complex (or real) Lie groups can be ‘linearized’ in- 
sofar as the theory of Lie algebras often allows one to translate theorems 
and constructions concerning (connected) Lie groups into issues concerning 
Lie algebras. Since the Lie algebra only perceives tangential information at 
the origin, an attempt to construct a theory of Lie algebras in the context 
of algebraic groups is reasonable as long as the group schemes are reduced 
(since tangent spaces can’t distinguish between a scheme and the underly- 
ing reduced subscheme). In particular, everything is fine in characteristic 
0. However, in characteristic p there are many group schemes which are 
not reduced. This is the most important fact about the theory of group 
schemes in characteristic p. Any attempt at constructing a ‘Lie algebra’ 
theory for non-reduced group schemes over a field of positive characteristic 
must use more subtle infinitesimal information than that detected at the 
level of tangent spaces. 

Let k be any field of characteristic p > 0. For the study of finite 
k-group schemes (recall the commutativity hypotheses) there is a theory 
of Dieudonne modules which serves as a good analogue to the theory of 
Lie algebras, up to the fact that the Dieudonne theory is contravariant 
(like a cotangent space rather than a tangent space). In fact, the theory 
of Dieudonne modules covers a much wider class of commutative k-group 
schemes than the finite ones, but we restrict ourselves to this case, as it is 
all that we will need. Before discussing the basic ingredients of this theory, 
we mention that Fontaine’s idea (following Grothendieck) is that a finite
flat group scheme $G$ over $\mathbf{Z}_p$ should be classified by specifying its closed
fiber (a finite $\mathbf{F}_p$-group scheme, or equivalently, a ‘Dieudonne module’),
together with some additional ‘lifting data.’ In other words, he proposed
a refinement of the theory of Dieudonne modules which would create an
analogue to the theory of Lie algebras for finite flat $\mathbf{Z}_p$-group schemes.
Fontaine’s theory actually applies to finite flat group schemes over any
base $W(k)$ with $k$ a perfect field of characteristic $p$. We will only discuss
the case $k = \mathbf{F}_p$, as this case is all we shall need and it avoids some
technical Frobenius-semilinearity issues (thereby making computations simpler).
Omitting this Frobenius-semilinearity may create some mistaken
impressions about the shape of the more general theory, so what follows should
be taken as a slightly skewed perspective on Fontaine’s theory.

The reason that it is preferable to classify a finite flat $\mathbf{Z}_p$-group scheme
by its closed fiber, together with ‘extra data,’ rather than by its generic
fiber, together with ‘extra data,’ is that closed fibers can be classified by the
very explicit linear-algebraic notion of a Dieudonne module. The generic
fibers, on the other hand, constitute the entire theory of finite discrete
modules over $D_p$. Since the structure of $D_p$ is still quite a mystery, using
Theorem 1.6 to ‘classify’ finite flat $\mathbf{Z}_p$-group schemes via flat
representations is not a very useful ‘classification.’

We now give the fundamental classification of finite $\mathbf{F}_p$-group schemes
via Dieudonne modules. An essentially self-contained development of the
general theory of Dieudonne modules is given in [12, Ch II]. Define the
‘Dieudonne ring’ $D = \mathbf{Z}_p[F, V]/(FV - p)$ to be a ring with the variables $F$
(‘Frobenius’) and $V$ (‘Verschiebung’). By a finite $D$-module, we will mean
a $D$-module with finite $\mathbf{Z}_p$-length (and so the underlying abelian group
is finite). The category of finite $D$-modules is an abelian category in an
evident way. The category of finite $\mathbf{F}_p$-group schemes with $p$-power order
is also an abelian category, using scheme-theoretic kernels and quotients.
This is one of the essential ingredients in the proof of


Theorem 4.1. (Dieudonne-Cartier) There exists a contravariant additive
anti-equivalence of abelian categories $M : G \rightsquigarrow M(G)$ from the category
of finite $\mathbf{F}_p$-group schemes of $p$-power order to the category of finite
$D$-modules. Moreover, the order of $G$ is equal to the order of $M(G)$ (i.e.,
$p^{\ell_{\mathbf{Z}_p}(M(G))}$).

Proof. See [12, Ch III] for Fontaine’s proof, where it is obtained from
more general considerations in the setting of certain formal commutative
group schemes over an arbitrary perfect field of characteristic $p$. ∎

For a finite $\mathbf{F}_p$-group scheme $G$, the Dieudonne module of $G$ is the finite
$D$-module $M(G)$ from Theorem 4.1. Just to give a hint as to where the
construction of $M(G)$ comes from, let us look at finite abelian $p$-groups, which
are the same thing as finite $\mathbf{C}$-group schemes of $p$-power order. Cartier
duality $G \rightsquigarrow G^*$, with

\[ G^*(T) = \mathrm{Hom}_T(G \times_{\mathbf{C}} T, \mathbf{G}_{m,T}) \]

for $\mathbf{C}$-schemes $T$, becomes on $\mathbf{C}$-points just classical duality of finite abelian
$p$-groups:

\[ G \rightsquigarrow G^* = \mathrm{Hom}(G, \mathbf{C}^\times) = \mathrm{Hom}(G, \mathbf{Q}/\mathbf{Z})
   = \mathrm{Hom}(G, \mathbf{Q}_p/\mathbf{Z}_p). \]

The $\mathbf{Q}_p/\mathbf{Z}_p$ term is a direct limit of the groups
$\mathbf{Z}/p^n = W_n(\mathbf{F}_p)$, where $W_n$
is the ‘Witt ring scheme of length $n$.’ Thus, if we could somehow exploit
the group/ring-scheme theoretic properties of the $W_n$’s, we could try to
construct a ‘schemified version’ of $\mathbf{Q}_p/\mathbf{Z}_p$ as a formal scheme. Considering
group scheme maps into such an object would give a reasonable candidate
for a ‘linear algebra’ object attached to $G$. The actual construction of
$M(G)$ requires some care (e.g., $W_n$ is a finite type scheme, not finite or
even formal), but the above gives the flavor of the basic idea.

The essential content of Theorem 4.1 is that one can pass from ‘linear
algebra data’ such as a finite $D$-module and produce something as subtle as
a (possibly non-reduced) $\mathbf{F}_p$-group scheme. By considering how the functor
$M$ is constructed as a sort of dual and trying to mimic ‘double duality’ to
get a quasi-inverse to $M$, the general proof of Theorem 4.1 uses a finite
$D$-module $M$ to define a functor (analogous to a ‘double dual’) from finite
$\mathbf{F}_p$-algebras to abelian groups. Then a general (and essentially formal)
‘pro-representability’ theorem of Grothendieck’s is invoked. See [5, §1.4]
for a precise formulation and proof of this ‘pro-representability’ theorem
in the form needed. In this way, one gets a commutative formal $\mathbf{F}_p$-group
scheme whose affine ring $R_M$ is an inverse limit of finite $\mathbf{F}_p$-algebras over an
enormous index set. One then has to show that $R_M$ is actually finite over $\mathbf{F}_p$
(and that the finite $\mathbf{F}_p$-group scheme $G_M = \mathrm{Spec}(R_M)$ has $M(G_M) \simeq M$
naturally in $M$). But at least $R_M$ is a ring to work with, albeit an abstract
one (so Theorem 4.1 is not a complete black hole).

There are more refined versions of Theorem 4.1 which translate various
notions from the theory of finite $\mathbf{F}_p$-group schemes over into the language
of finite $D$-modules. We give a limited sampling that is all we shall need.


Lemma 4.2 Let $G$ be a finite $\mathbf{F}_p$-group scheme with $p$-power order,
$M = M(G)$. Then $G$ is étale if and only if $F(M) = M$ (or, equivalently,
$F : M \to M$ is a $\mathbf{Z}_p$-linear isomorphism) and $G$ is connected if and only if the
action of $F$ is nilpotent. Define $M^* = \mathrm{Hom}_{\mathbf{Z}_p}(M, \mathbf{Q}_p/\mathbf{Z}_p)$ and let $F$ and
$V$ act on $M^*$ as duals to the actions of $V$ and $F$ on $M$ respectively. In
this way, $M^*$ has the structure of a finite $D$-module and there is a natural
isomorphism of finite $D$-modules

\[ \gamma_G : M(G^*) \simeq M^*, \]

with $G^*$ the Cartier dual of $G$.


The étale criterion in Lemma 4.2 corresponds to the fact that a finite
$\mathbf{F}_p$-algebra $R$ is étale if and only if $r \mapsto r^p$ is an automorphism of $R$.
The connectedness criterion corresponds to the fact [12, Ch I, Rem 9.5.2]
that a finite connected $\mathbf{F}_p$-group scheme always has an affine ring

\[ R = \mathbf{F}_p[X_1, \ldots, X_m]/(X_i^{p^{n_i}}), \]

so it necessarily has $p$-power order, and for $n \geq \max(n_i)$ the $n$th iterate of
$r \mapsto r^p$ kills the augmentation ideal of $R$. The
isomorphism $\gamma_G$ is constructed in [12, Ch II, §5] in a very indirect
manner. In particular, it is not at all clear from the definition that $\gamma_G$ is a
‘symmetric’ duality pairing in the sense that the natural diagram

\[
\begin{array}{ccc}
M(G)^{**} & \xrightarrow{\ \gamma_G^{*}\ } & M(G^{*})^{*} \\
{\scriptstyle\simeq}\uparrow & & \uparrow{\scriptstyle\gamma_{G^{*}}} \\
M(G) & \xleftarrow{\ M(\alpha_G)\ } & M(G^{**})
\end{array}
\]

commutes, where $\alpha_G : G \simeq G^{**}$ is the canonical isomorphism. This is a
complicated technical point which we will not need, so we will not discuss
it. What we will need is the existence of the (natural) isomorphism $\gamma_G$.


Example 4.3 By Theorem 4.1, $M(\mathbf{Z}/p)$, $M(\mu_p)$, and $M(\alpha_p)$ are all
1-dimensional over $\mathbf{F}_p$. The actions of $F$ and $V$ on these are given as follows:

    $G = \mathbf{Z}/p$:  $F = 1$, $V = 0$;

    $G = \mu_p$:  $F = 0$, $V = 1$;

    $G = \alpha_p$:  $F = V = 0$.

By Lemma 4.2, the only thing which remains to be checked is that $V = 1$
when $G = \mu_p$. This follows from [12, Ch I, §8.7; Ch III, Prop 4.3].
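
As a small sanity check on Example 4.3, the criteria of Lemma 4.2 can be evaluated mechanically on these three one-dimensional modules. The sketch below is only an illustration: the prime `P = 5`, the dictionary `DIEUDONNE_MODULES` and the helper names are our own choices, and on a line over $\mathbf{F}_p$ we identify $F$ and $V$ with scalars.

```python
P = 5  # sample odd prime; each module below is a line over F_P

# Scalars (F, V) on the one-dimensional module M(G), copied from Example 4.3.
DIEUDONNE_MODULES = {"Z/p": (1, 0), "mu_p": (0, 1), "alpha_p": (0, 0)}

def satisfies_relation(F, V):
    """FV = VF = p, which is 0 on a module killed by p."""
    return (F * V) % P == 0 and (V * F) % P == 0

def is_etale(F, V):
    """Lemma 4.2: etale iff F is bijective (a nonzero scalar here)."""
    return F % P != 0

def is_connected(F, V):
    """Lemma 4.2: connected iff F is nilpotent (the zero scalar here)."""
    return F % P == 0

def dual_module(F, V):
    """Lemma 4.2: on M*, F and V act as the duals of V and F."""
    return (V, F)

for name, (F, V) in DIEUDONNE_MODULES.items():
    print(name, satisfies_relation(F, V), is_etale(F, V), is_connected(F, V))
```

Swapping $F$ and $V$ exchanges the modules of $\mathbf{Z}/p$ and $\mu_p$ and fixes that of $\alpha_p$, in agreement with the familiar Cartier duality between these group schemes.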


In general, it is possible that non-isomorphic finite flat $\mathbf{Z}_p$-group schemes
can have isomorphic closed fibers. In the notation of [26, Rem 5, pp. 15-16],
if $\pi^3 = p$ and we work in the category of finite flat group schemes
over $\mathbf{Z}_p[\pi]$, then $G_{\pi,\pi^2}$ and $G_{\pi^2,\pi}$ are non-isomorphic, yet they have
isomorphic closed fiber $\alpha_p$. With more work, one can construct examples
over $\mathbf{Z}_p$ as well. In other words, the analogue to Theorem 1.6 for passage
to the closed fiber is false: the functor
$G \rightsquigarrow G \times_{\mathbf{Z}_p} \mathbf{F}_p$ is not fully faithful.
The emphasis here is on the ‘fully’ part; it is shown in the course of the
proof of Theorem 4.5 below that this functor is faithful for $p \neq 2$.

Thus, if one wishes to describe finite flat $\mathbf{Z}_p$-group schemes in terms of
‘linear algebra data,’ then one needs to find some ‘extra structure’ within
$M(G \times_{\mathbf{Z}_p} \mathbf{F}_p)$ that encodes the lifting $G$ to $\mathbf{Z}_p$. Using the affine ring of
$G_{/\mathbf{Z}_p}$, Fontaine constructs a $\mathbf{Z}_p$-submodule of ‘logarithms’

\[ L(G) \subseteq M(G \times_{\mathbf{Z}_p} \mathbf{F}_p) \]

which is not necessarily stable under $F$ and $V$ but which satisfies the
following two properties:

(1) $V|_{L(G)} : L(G) \to M(G \times_{\mathbf{Z}_p} \mathbf{F}_p)$ is injective.

(2) The natural $\mathbf{Z}_p$-linear composite map

\[ L(G)/p \to M(G \times_{\mathbf{Z}_p} \mathbf{F}_p)/p \twoheadrightarrow M(G \times_{\mathbf{Z}_p} \mathbf{F}_p)/F \]

is an isomorphism.

This is valid even if $p = 2$.


For a finite $\mathbf{F}_p$-group scheme $H$, $M(H)/F$ can be canonically identified
with the cotangent space of $H$ at its origin [12, Ch III, Prop 4.4(ii)], so the
second condition above shows that $L(G)$ provides some sort of ‘minimal
basis’ lifting of the cotangent space, partially explaining the name ‘logarithm.’
[12, Ch IV] develops the ideas which motivate the construction of $L(G)$, as
well as the techniques which are needed to actually construct it. The brief
article [13] gives an outline of the actual construction, whose justification
relies on the theory of $p$-divisible groups; nearly the entire contents of the
book [12] are needed for this! For a more detailed explanation of [13], see
[7, §1].


It is now reasonable to make the following definition, following Fontaine. 


Definition 4.4 The category $SH^f$ of finite Honda systems (over $\mathbf{Z}_p$)
consists of pairs $(L, M)$ with $M$ a finite $D$-module and $L$ a $\mathbf{Z}_p$-submodule
satisfying the properties (1) and (2) above. The notion of a morphism is
defined in the obvious manner.


One can show directly that $SH^f$ is an abelian category, using the
obvious candidates for kernel and cokernel as the kernel and cokernel
objects. See [14, §1, §9], noting that [14, Prop 9.10] provides a translation of
$SH^f$ into the language of finite filtered modules $MF^f$ as is used in [14,
Prop 1.8]. In addition, the construction of $L(G)$ is sufficiently natural for
a finite flat $\mathbf{Z}_p$-group scheme $G$ so that

\[ LM : G \rightsquigarrow (L(G), M(G \times_{\mathbf{Z}_p} \mathbf{F}_p)) \]

is an additive contravariant functor from the category of finite flat
$\mathbf{Z}_p$-group schemes to the category $SH^f$. We have the following fundamental
fact, whose proof makes essential use of the theory of $p$-divisible groups:


Theorem 4.5 (Fontaine [13]) For $p \neq 2$, the additive contravariant functor
$LM$ is an anti-equivalence of abelian categories.


The idea of the proof is to simultaneously show that all finite flat
$\mathbf{Z}_p$-group schemes embed into $p$-divisible groups over $\mathbf{Z}_p$ and to invoke
Fontaine’s classification theory for such $p$-divisible groups (when $p \neq 2$) in
terms of a ‘finite free’ analogue of the notion of a finite Honda system. The
special fact about $p$-divisible groups used in all of this is that over a field
$k$ of characteristic $p$ (where the Dieudonne theory classifies various
commutative group schemes when the field is perfect), a connected $p$-divisible
group is the same thing as a finite-dimensional formal Lie group $\Gamma$ such
that for all finite $k$-algebras $R$, every element of the abelian group $\Gamma(R)$
has $p$-power order.

Theorem 4.5, in conjunction with a good understanding of the
construction of $LM$ (e.g., the theory of Dieudonne modules, Lemma 4.2, etc.),
enables us to translate questions concerning the construction and study of
finite flat $\mathbf{Z}_p$-group schemes (for $p \neq 2$) into analogous questions in the
setting of $SH^f$, where (in principle) everything is just a matter of ‘linear
algebra’! There are more refined questions one could ask, such as whether
one can give an explicit description of the functor that passes from a finite
Honda system to the generic fiber $D_p$-representation of the associated finite
flat $\mathbf{Z}_p$-group scheme. This can be done and is the essential point of [14, §9].
We will not need this. However, we note in passing that this point has led
to the misunderstanding that [14] is critical to the proof of Ramakrishna’s
theorem and Wiles’ proof of the modularity of semistable elliptic curves
over $\mathbf{Q}$. This is not true. Everything which Ramakrishna and Wiles need
is contained in Theorem 4.5, which was proven by Fontaine long before [14]
was written. Nevertheless, we remind the reader that [14, Prop 9.12] can
still be useful in the present setting; the main issue that [14, Prop 9.12]
enables us to handle is the analysis of generic fiber representation ‘tensor’
constructions such as ‘determinant’; see also [14, §6.13(6)] and its corrected
form in [6, §7.11]. Generic fiber representation ‘tensor’ constructions are
difficult to study solely from the point of view of group schemes because
the representation-theoretic notion of a tensor product has no analogue in
the context of group schemes.

Before using Theorem 4.5 to construct $\mathbf{Z}_p$-flat representations of $D_p$ in
the next section, we give a modified formulation which is used to handle
the cases in which some finite extension $\mathcal{O}$ of $\mathbf{Z}_p$ acts on the representation
space. Keeping in mind Raynaud’s full faithfulness result (Theorem 1.6),
it is not hard to deduce a variant of Theorem 4.5 in the following manner.

Let $A$ be a finite local $\mathbf{Z}_p$-algebra and let $D_A = A[F, V]/(FV - p)$.
We can define the notion of a finite $D_A$-module in the obvious way; the
category of such objects forms an $A$-linear subcategory of the category of
finite $D$-modules. In a similar way, we can define a finite $A$-Honda system
$(L, M)$ by replacing $\mathbf{Z}_p$ by $A$ in the definition of a finite Honda system
(still using $L/p$ and not $L/\mathfrak{m}_A$ in condition (2)). We then get a category
$SH^f_A$ which one readily checks is an abelian subcategory of $SH^f$, with the
‘forgetful functor’ $SH^f_A \to SH^f$ exact. We then have


Corollary 4.6 For $A$ as above and $p \neq 2$, the functor $LM$ induces an
$A$-linear anti-equivalence of abelian categories between the category of finite
flat $\mathbf{Z}_p$-group schemes with an $A$-action on the generic fiber representation
and the category $SH^f_A$. If $A = \mathcal{O}$ or $\mathcal{O}/\lambda^n$ and $G$ is a finite flat $\mathbf{Z}_p$-group
scheme with an $A$-action on its generic fiber representation $\rho$, then $\rho$ and
$M(G \times_{\mathbf{Z}_p} \mathbf{F}_p)$ are non-canonically isomorphic as $A$-modules.

Proof. The only point we need to check is the final one. For a
finite-length $\mathcal{O}$-module $N$, it is clear that the abstract $\mathcal{O}$-module structure of $N$ is
determined by the $\mathcal{O}$-module structure of $\lambda N$ and the value of $\dim_k(N[\lambda])$.
Arguing par dévissage and using the fact that the dimension of a $k$-vector
space is determined by the underlying $\mathbf{F}_p$-vector space, we are reduced to
the case $\mathcal{O} = \mathbf{Z}_p$ and objects killed by $p$. It now suffices to invoke the fact
that $p^{\ell_{\mathbf{Z}_p}(M(G \times_{\mathbf{Z}_p} \mathbf{F}_p))}$ is equal to the order of
$G \times_{\mathbf{Z}_p} \mathbf{F}_p$, which is equal to
the order of $G$. ∎

§5. APPLICATIONS TO FLAT DEFORMATIONS 

We now apply Fontaine’s theory to complete the proof of Lemma 3.4 and 
Ramakrishna’s Theorem 3.5. For the proof of Lemma 3.4 in the residually 
reducible (i.e., ‘ord’) cases, we give a variant on the arguments in [27] and [8,
§2.5]. Our argument is different insofar as we make direct use of the results
in §4, rather than work in the language of Fontaine-Laffaille modules. This 
makes the role of the Dieudonne theory more explicit and clarifies the role 
of the theory of finite flat group schemes in the calculations. 

After finishing the proof of Lemma 3.4 in the residually reducible case, 
we will carry out the unfinished calculations from our earlier sketch of the 
proof of Theorem 3.5 (which, as we have already seen, implies Lemma 3.4 
in the residually irreducible case). 

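Let $\bar\rho$ and $\rho$ be as in Lemma 3.4, with $p \neq 2$ and $\bar\rho$ possibly irreducible.
By Theorem 1.6, there is a canonical finite flat $\mathbf{Z}_p$-group scheme $G(\rho)$ with
generic fiber representation $\rho$. By Corollary 4.6, we obtain an object

\[ LM(\rho) = (L(\rho), M(\rho)) = (L(G(\rho)), M(G(\rho) \times_{\mathbf{Z}_p} \mathbf{F}_p)) \]

in $SH^f_{\mathcal{O}/\lambda^n}$. As an $\mathcal{O}$-module, $M(\rho)$ is free of rank 2 over
$\mathcal{O}/\lambda^n$, by Corollary
4.6. Observe that we have natural $\mathcal{O}$-linear isomorphisms

\[
\begin{aligned}
H^1_{fl}(D_p, \mathrm{ad}(\rho)) &\simeq D^{fl}_{\rho}((\mathcal{O}/\lambda^n)[\varepsilon]) \\
 &\simeq \mathrm{Ext}^1_{(\mathcal{O}/\lambda^n)[D_p], fl}(\rho, \rho) \\
 &\simeq \mathrm{Ext}^1_{SH^f_{\mathcal{O}/\lambda^n}}(LM(\rho), LM(\rho)).
\end{aligned}
\]

Using the above chain of isomorphisms, the following gives a more precise
version of Lemma 3.4.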


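Theorem 5.1. As $\mathcal{O}/\lambda^n$-modules, there is a non-canonical isomorphism

\[ \mathrm{Ext}^1_{SH^f_{\mathcal{O}/\lambda^n}}(LM(\rho), LM(\rho))
   \simeq (\mathcal{O}/\lambda^n) \oplus (\mathcal{O}/\lambda^n) \oplus H^0(D_p, \mathrm{ad}^0(\rho)). \]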


Proof. For any finite $\mathcal{O}$-Honda system $(L, M)$, the composite map

\[ L/p \to M/p \to M/FM \]

is an isomorphism of $\mathcal{O}$-modules. If we apply $\otimes_{\mathcal{O}} k$, we get a composite
isomorphism

\[ L/\lambda \to M/\lambda \to (M/FM) \otimes_{\mathcal{O}} k, \]

so the map $L/\lambda \to M/\lambda$ is injective. Thus, $L$ is necessarily an $\mathcal{O}$-module
direct summand of $M$.

Consider the pair $(L(\rho), M(\rho))$. We claim that $L(\rho) \neq 0$ and
$L(\rho) \neq M(\rho)$, so therefore $L(\rho)$ is a rank 1 free $\mathcal{O}/\lambda^n$-module direct summand of
$M(\rho)$. If $L(\rho) = 0$, then $M(\rho)/F(M(\rho)) = 0$, so by Lemma 4.2,
$G(\rho) \times_{\mathbf{Z}_p} \mathbf{F}_p$
is étale. But then $G(\rho)$ is étale over $\mathbf{Z}_p$, in which case $\rho$ is unramified. This
contradicts the fact that $\bar\rho = \rho \bmod \lambda$ has a ramified determinant.

If $L(\rho) = M(\rho)$, then the injectivity of $V$ on $L(\rho)$ implies that $V$ is an
automorphism on $M(\rho)$. Hence, by Lemma 4.2, the Cartier dual of $G(\rho)$
is a finite étale $\mathbf{Z}_p$-group scheme, so the Cartier dual $\rho^*$ of $\rho$ is unramified.
Thus, the Cartier dual $\bar\rho^*$ of $\bar\rho$ is unramified, since
$\bar\rho^* \simeq \rho^*[\lambda]$. This is
inconsistent with the classification of possibilities for $\bar\rho$ in Theorem 1.8(7).

Choose a basis $e_1, e_2$ of $M(\rho)$ over $\mathcal{O}/\lambda^n$ such that $e_2$ is a basis for the
direct summand $L(\rho)$. Consider an extension $(L, M)$ of $LM(\rho)$ by itself in
the abelian category $SH^f_{\mathcal{O}/\lambda^n}$. By the very nature of the abelian category
structure of $SH^f_{\mathcal{O}/\lambda^n}$ (i.e., the construction of kernels and cokernels), it
follows that $M$ must be free of rank 4 as an $\mathcal{O}/\lambda^n$-module and $L$ must be
free of rank 2 as an $\mathcal{O}/\lambda^n$-module. Now choose an abstract rank 4 free
$\mathcal{O}/\lambda^n$-module $M$ with a chosen basis $m_1, m_2, m_3, m_4$ and we set $L$ to be
the submodule spanned by $m_2$ and $m_4$. We fix a short exact sequence of
$\mathcal{O}/\lambda^n$-modules

\[ 0 \to M(\rho) \xrightarrow{\ j\ } M \xrightarrow{\ h\ } M(\rho) \to 0 \]

determined by $j(e_1) = m_1$, $j(e_2) = m_2$ and $h(m_3) = e_1$, $h(m_4) = e_2$. Our
problem is to count the number of ways (up to equivalence respecting $L$)
we can impose a $D$-module structure on $M$ compatible with the $D$-module
structure on $M(\rho)$ via $j$ and $h$. The main point is that since $LM(\rho)$ is an
object in $SH^f_{\mathcal{O}/\lambda^n}$, any such $D$-module structure on $M$ would
have to make $(L, M)$ an object in $SH^f_{\mathcal{O}/\lambda^n}$ (and so the resulting sequence
would be a short exact sequence in $SH^f_{\mathcal{O}/\lambda^n}$). We will check this below.

We will only consider the case in which $\bar\rho$ is reducible, since otherwise we
have seen in §3 that Theorem 5.1 follows from Theorem 3.5, whose proof
we will finish later. Since $\bar\rho$ is reducible, by Theorem 1.8(2) there exist
continuous unramified characters $\chi_i : D_p \to (\mathcal{O}/\lambda^n)^\times$ and a short exact
sequence of $(\mathcal{O}/\lambda^n)[D_p]$-modules

\[ 0 \to \varepsilon\chi_1 \to \rho \to \chi_2 \to 0. \]

Lemma 4.2 and the contravariance of $LM$ enable us to modify our choice
of $e_1$ so that $(0, (\mathcal{O}/\lambda^n)e_1)$ is a subobject of $LM(\rho)$, corresponding to the
unramified quotient $\chi_2$. In addition, we have $F(e_1) = ve_1$ and
$V(e_1) = v^{-1}pe_1$ for some $v \in (\mathcal{O}/\lambda^n)^\times$.

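Set $V(e_2) = ae_1 + ue_2$ and $F(e_2) = be_1 + ce_2$. Since the
subrepresentation $\varepsilon\chi_1$ of $\rho$ has an unramified Cartier dual, it follows that
$u \in (\mathcal{O}/\lambda^n)^\times$.
Finally, the conditions $FV = VF = p$ force $c = u^{-1}p$ and $a = -v^{-1}ub$, so
with respect to the ordered basis $\{e_1, e_2\}$ of $M(\rho)$, we have matrices

\[
F_{M(\rho)} = \begin{pmatrix} v & b \\ 0 & u^{-1}p \end{pmatrix}, \qquad
V_{M(\rho)} = \begin{pmatrix} v^{-1}p & -v^{-1}ub \\ 0 & u \end{pmatrix}.
\]

Since we can write the actions of $F$ and $V$ on $M$ in the ‘block matrix’ form

\[
F_M = \begin{pmatrix} F_{M(\rho)} & X \\ 0 & F_{M(\rho)} \end{pmatrix}, \qquad
V_M = \begin{pmatrix} V_{M(\rho)} & -Y \\ 0 & V_{M(\rho)} \end{pmatrix},
\]

the conditions $FV = VF = p$ yield the matrix equations

\[ XV_{M(\rho)} = F_{M(\rho)}Y, \qquad V_{M(\rho)}X = YF_{M(\rho)}. \]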
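
The relations forced by $FV = VF = p$ are easy to confirm numerically. The following is a minimal sketch under the assumption $\mathcal{O} = \mathbf{Z}_p$, so that $\mathcal{O}/\lambda^n = \mathbf{Z}/p^n$; the helper names (`matmul`, `frobenius_and_verschiebung`, `check_all`) are ours. It builds the two matrices above for every choice of units $u, v$ and every $b$ and checks that both products equal the scalar matrix $p$.

```python
from itertools import product

def matmul(A, B, m):
    """Multiply 2x2 matrices with entries reduced modulo m."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) % m for j in range(2)]
            for i in range(2)]

def frobenius_and_verschiebung(p, n, u, v, b):
    """Matrices of F and V on M(rho) over Z/p^n (assumption: O = Z_p)."""
    m = p ** n
    u_inv, v_inv = pow(u, -1, m), pow(v, -1, m)  # needs Python 3.8+
    F = [[v % m, b % m], [0, (u_inv * p) % m]]
    V = [[(v_inv * p) % m, (-v_inv * u * b) % m], [0, u % m]]
    return F, V

def check_all(p, n):
    """True if FV = VF = p for all units u, v and all b modulo p^n."""
    m = p ** n
    scalar_p = [[p % m, 0], [0, p % m]]
    units = [x for x in range(m) if x % p != 0]
    for u, v, b in product(units, units, range(m)):
        F, V = frobenius_and_verschiebung(p, n, u, v, b)
        if matmul(F, V, m) != scalar_p or matmul(V, F, m) != scalar_p:
            return False
    return True

print(check_all(3, 2))
```

The same bookkeeping applied to the $4 \times 4$ block matrices reduces, block by block, to the two matrix equations for $X$ and $Y$ displayed above.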


We claim that with such data, $(L, M)$ will necessarily be an object in
$SH^f_{\mathcal{O}/\lambda^n}$. Since $V|_{L(\rho)}$ is injective, $V|_L$ is injective. In order to deduce that
$L/p \to M/FM$ is an isomorphism from the corresponding fact for $L(\rho)$
and $M(\rho)$, the crux of the argument is to check that the sequence

\[ 0 \to M(\rho)/F(M(\rho)) \to M/F(M) \to M(\rho)/F(M(\rho)) \to 0 \]

is actually exact on the left. In terms of the explicit basis for $M(\rho)$, this
reduces to the statement that if $m \in M(\rho)$ is of the form $X(m')$ with
$F_{M(\rho)}(m') = 0$, then $m \in F_{M(\rho)}(M(\rho))$. Using length considerations, it is
a straightforward consequence of the axioms for a finite Honda system that
the sequence

\[ 0 \to M(\rho)/V(M(\rho)) \xrightarrow{\ F\ } M(\rho)/p \to M(\rho)/F(M(\rho)) \to 0 \]

is exact (and not just right exact). Thus, $F_{M(\rho)}(m') = 0$ implies that
$m' = V_{M(\rho)}(m_0)$ for some $m_0 \in M(\rho)$, so

\[ m = X(m') = XV_{M(\rho)}(m_0) = F_{M(\rho)}Y(m_0) \in F_{M(\rho)}(M(\rho)), \]

as desired.
Define the $\mathcal{O}/\lambda^n$-module

\[ E(\rho) = \{(X, Y) \in M_2(\mathcal{O}/\lambda^n) \times M_2(\mathcal{O}/\lambda^n) \mid
   XV_{M(\rho)} = F_{M(\rho)}Y,\ V_{M(\rho)}X = YF_{M(\rho)}\}. \]

Inside of here is a submodule of ‘commutators’

\[ C(\rho) = \{([F_{M(\rho)}, A], -[V_{M(\rho)}, A]) \mid A = (a_{ij}) \in M_2(\mathcal{O}/\lambda^n),\ a_{12} = 0\} \]

(the $a_{12} = 0$ condition corresponds to preserving the line
$L(\rho) = (\mathcal{O}/\lambda^n)e_2$
inside of $M(\rho)$). It follows from [19, Ch III] that we have an $\mathcal{O}/\lambda^n$-module
isomorphism

\[ \mathrm{Ext}^1_{SH^f_{\mathcal{O}/\lambda^n}}(LM(\rho), LM(\rho)) \simeq E(\rho)/C(\rho). \]


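We now will determine the $\mathcal{O}/\lambda^n$-module structures of $E(\rho)$ and $C(\rho)$.
Choose $X = (x_{ij})$ and $Y = (y_{ij})$ in $M_2(\mathcal{O}/\lambda^n)$. The condition that
$(X, Y) \in E(\rho)$ is easily checked to be equivalent to the simultaneous
constraints

\[ x_{21} = u^{-1}vy_{21}, \quad x_{22} = u^{-1}(by_{21} + u^{-1}py_{22}), \quad
   y_{11} = v^{-1}(v^{-1}px_{11} - by_{21}), \]

and

\[ y_{12} = v^{-1}(-v^{-1}ubx_{11} + ux_{12} - by_{22}), \]

with $x_{11}$, $x_{12}$, $y_{21}$, and $y_{22}$ arbitrarily chosen in $\mathcal{O}/\lambda^n$. Thus,

\[ E(\rho) \simeq (\mathcal{O}/\lambda^n)^{\oplus 4}. \]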
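
The four constraints can be tested directly. The sketch below again assumes $\mathcal{O} = \mathbf{Z}_p$ (so the coefficient ring is $\mathbf{Z}/p^n$); all function names are ours. It checks that every choice of the four free entries produces a pair $(X, Y)$ satisfying both defining equations of $E(\rho)$, and, for $n = 1$, counts $E(\rho)$ by exhaustive search to confirm that nothing else lies in it.

```python
from itertools import product

def matmul(A, B, m):
    """Multiply 2x2 matrices with entries reduced modulo m."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) % m for j in range(2)]
            for i in range(2)]

def structure_matrices(p, n, u, v, b):
    """F and V on M(rho) over Z/p^n (assumption: O = Z_p)."""
    m = p ** n
    ui, vi = pow(u, -1, m), pow(v, -1, m)
    F = [[v % m, b % m], [0, (ui * p) % m]]
    V = [[(vi * p) % m, (-vi * u * b) % m], [0, u % m]]
    return F, V

def in_E(X, Y, F, V, m):
    """Defining equations of E(rho): XV = FY and VX = YF."""
    return (matmul(X, V, m) == matmul(F, Y, m)
            and matmul(V, X, m) == matmul(Y, F, m))

def from_free_entries(p, n, u, v, b, x11, x12, y21, y22):
    """Fill in x21, x22, y11, y12 using the four constraints in the text."""
    m = p ** n
    ui, vi = pow(u, -1, m), pow(v, -1, m)
    x21 = (ui * v * y21) % m
    x22 = (ui * (b * y21 + ui * p * y22)) % m
    y11 = (vi * (vi * p * x11 - b * y21)) % m
    y12 = (vi * (-vi * u * b * x11 + u * x12 - b * y22)) % m
    return [[x11, x12], [x21, x22]], [[y11, y12], [y21, y22]]

def parametrization_holds(p, n, u, v, b):
    """True if every choice of free entries gives an element of E(rho)."""
    m = p ** n
    F, V = structure_matrices(p, n, u, v, b)
    return all(in_E(*from_free_entries(p, n, u, v, b, *free), F, V, m)
               for free in product(range(m), repeat=4))

def brute_force_size(p, u, v, b):
    """|E(rho)| for n = 1, by testing all p^8 pairs (X, Y)."""
    F, V = structure_matrices(p, 1, u, v, b)
    return sum(1 for e in product(range(p), repeat=8)
               if in_E([[e[0], e[1]], [e[2], e[3]]],
                       [[e[4], e[5]], [e[6], e[7]]], F, V, p))

print(parametrization_holds(3, 2, 4, 2, 3), brute_force_size(3, 2, 1, 1))
```

The exhaustive count for $p = 3$, $n = 1$ should equal $3^4$, matching the freeness of rank 4 claimed for $E(\rho)$.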


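In order to determine the $\mathcal{O}/\lambda^n$-module structure of $C(\rho)$, we note that
there is an obvious surjection

\[ q : \{A \in M_2(\mathcal{O}/\lambda^n) \mid a_{12} = 0\} \twoheadrightarrow C(\rho) \]

and the kernel is identified (as an $\mathcal{O}/\lambda^n$-module) with all $\mathcal{O}/\lambda^n$-module
endomorphisms of $M(\rho)$ which stabilize $L(\rho)$ and commute with the actions
of $F$ and $V$. In other words, we have an $\mathcal{O}$-linear isomorphism

\[ \ker(q) \simeq \mathrm{End}_{SH^f_{\mathcal{O}/\lambda^n}}(LM(\rho)) \simeq H^0(D_p, \mathrm{ad}(\rho)), \]

where the second isomorphism is $\mathcal{O}$-linear because of the linearity
properties of the functor in Corollary 4.6. It is straightforward to compute that
for $A = (a_{ij}) \in M_2(\mathcal{O}/\lambda^n)$ with $a_{12} = 0$, $A \in \ker(q)$ if and only if $a_{21} = 0$
and $b(a_{11} - a_{22}) = 0$, so

\[ H^0(D_p, \mathrm{ad}(\rho)) \simeq (\mathcal{O}/\lambda^n) \oplus (\mathcal{O}/\lambda^n)[b]. \]

Thus, we obtain

\[ C(\rho) \simeq (\mathcal{O}/\lambda^n)^{\oplus 3}/((\mathcal{O}/\lambda^n) \oplus (\mathcal{O}/\lambda^n)[b])
   \simeq (\mathcal{O}/\lambda^n) \oplus (b \cdot (\mathcal{O}/\lambda^n)) \]

and

\[ H^0(D_p, \mathrm{ad}^0(\rho)) \simeq (\mathcal{O}/\lambda^n)[b] \simeq (\mathcal{O}/\lambda^n)/b. \]

Since

\[ E(\rho)/C(\rho) \simeq (\mathcal{O}/\lambda^n)^{\oplus 2} \oplus (\mathcal{O}/\lambda^n)/b, \]

we are done. ∎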
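
The sizes in this last step can also be confirmed by enumeration. As before, this is only a sketch assuming $\mathcal{O} = \mathbf{Z}_p$, with hypothetical helper names; it lists the commutator pairs making up $C(\rho)$ for all $A$ with $a_{12} = 0$, and divides $|E(\rho)| = |\mathcal{O}/\lambda^n|^4$ (taken from the text) by $|C(\rho)|$.

```python
from itertools import product

def matmul(A, B, m):
    """Multiply 2x2 matrices with entries reduced modulo m."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) % m for j in range(2)]
            for i in range(2)]

def structure_matrices(p, n, u, v, b):
    """F and V on M(rho) over Z/p^n (assumption: O = Z_p)."""
    m = p ** n
    ui, vi = pow(u, -1, m), pow(v, -1, m)
    F = [[v % m, b % m], [0, (ui * p) % m]]
    V = [[(vi * p) % m, (-vi * u * b) % m], [0, u % m]]
    return F, V

def commutator(M, A, m):
    """[M, A] = MA - AM with entries reduced modulo m."""
    MA, AM = matmul(M, A, m), matmul(A, M, m)
    return [[(MA[i][j] - AM[i][j]) % m for j in range(2)] for i in range(2)]

def size_of_C(p, n, u, v, b):
    """Number of distinct pairs ([F, A], -[V, A]) with a12 = 0."""
    m = p ** n
    F, V = structure_matrices(p, n, u, v, b)
    seen = set()
    for a11, a21, a22 in product(range(m), repeat=3):
        A = [[a11, 0], [a21, a22]]
        X = commutator(F, A, m)
        Y = [[(-c) % m for c in row] for row in commutator(V, A, m)]
        seen.add((tuple(map(tuple, X)), tuple(map(tuple, Y))))
    return len(seen)

def size_of_quotient(p, n, u, v, b):
    """|E(rho)/C(rho)|, using |E(rho)| = (p^n)^4 from the text."""
    return (p ** n) ** 4 // size_of_C(p, n, u, v, b)

for b in (0, 1, 3):
    print(b, size_of_C(3, 2, 4, 2, b), size_of_quotient(3, 2, 4, 2, b))
```

For $\mathbf{Z}/9$ the quotient has order $81 \cdot |(\mathbf{Z}/9)/b|$, that is 729, 81 and 243 for $b = 0, 1, 3$ respectively, as the theorem predicts.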

We now complete the proof of Theorem 3.5 (which requires $\bar\rho$ to have
trivial centralizer). We will handle the residually reducible and irreducible
cases separately. First, we need to check that
$|\mathrm{Ext}^1_{SH^f_k}(LM(\bar\rho), LM(\bar\rho))| = |k|^2$ in order to obtain a surjection

\[ \pi : \mathcal{O}[[T_1, T_2]] \twoheadrightarrow R^{fl}_{\bar\rho}. \]

In the residually reducible case, this is just the calculation in the proof of
Theorem 5.1, with $n = 1$. Now consider the case in which $\bar\rho$ is irreducible.
By Theorem 1.8(2), $\bar\rho|_{I_p}$ is self-dual. In particular, the closed fiber of $G(\bar\rho)$
is connected with a connected dual, so by Lemma 4.2, $F$ and $V$ act in a
nilpotent manner on $M(\bar\rho)$. Thus, $\ker(V) \neq 0$, so we have a natural $k$-linear
isomorphism

\[ \ker(V) \oplus L(\bar\rho) \simeq M(\bar\rho), \]

giving two natural lines in $M(\bar\rho)$. Let $L(\bar\rho) = ke_2$ and $\ker(V) = ke_1$, so
with respect to this basis, we get the matrix

\[ V_{M(\bar\rho)} = \begin{pmatrix} 0 & b \\ 0 & 0 \end{pmatrix}, \]

with $b \neq 0$. The condition $FV = VF = p = 0$ on $M(\bar\rho)$ yields

\[ F_{M(\bar\rho)} = \begin{pmatrix} 0 & \bar{a} \\ 0 & 0 \end{pmatrix}, \]

for some $\bar{a} \in k$. Applying the above reasoning to the irreducible flat
‘connected’ dual $\bar\rho^*$, we see that $\bar{a} \neq 0$ by Lemma 4.2.

Thus, $V = \zeta F$, with $\zeta = \bar{a}^{-1}b \in k^\times$. It won’t matter for us what the
value of $\zeta \in k^\times$ is, but we mention for completeness that $\zeta$ is determined
by the unramified character

\[ \omega^{-1}\det\bar\rho : D_p \to k^\times, \]

with $\zeta = -1$ when $\det\bar\rho = \omega$ (see [7, Lemma 6.1] for more details). Defining
$E(\bar\rho)$ and $C(\bar\rho)$ as in the proof of Theorem 5.1, we compute that $E(\bar\rho)$ is
4-dimensional over $k$, while $C(\bar\rho)$ is 2-dimensional over $k$, so

\[ \mathrm{Ext}^1_{SH^f_k}(LM(\bar\rho), LM(\bar\rho)) \simeq E(\bar\rho)/C(\bar\rho) \]

is 2-dimensional over $k$, as desired.

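As we saw in the sketch of the proof of Theorem 3.5 earlier, it remains
to check that

$$|D^{\mathrm{fl}}_{\mathcal{O}}(\mathcal{O}/\lambda^n)| = |k|^{2(n-1)}.$$

In fact, the surjection $\pi$ already gives us $\leq$, so it is enough to simply
construct $|k|^{2(n-1)}$ distinct flat deformations of $\bar\rho$ to $GL_2(\mathcal{O}/\lambda^n)$. First
consider the residually irreducible case. We will list $|k|^{2(n-1)}$ objects
$X = (L_X, M_X)$ in $SH^f_{\mathcal{O}/\lambda^n}$ with $M_X$ free of rank 2 over $\mathcal{O}/\lambda^n$ and


$$X[\lambda] \simeq LM(\bar\rho)$$


in $SH^f_k$ (recall that $LM$ is contravariant). Fix $a \in \mathcal{O}/\lambda^n$ lifting $\bar a \in k^\times$
(where $\bar a$ is defined via a matrix for $F_{M(\bar\rho)}$ as above). Choose any
$\alpha, \beta \in \mathcal{O}/\lambda^n$ with $\alpha \equiv 0 \bmod \lambda$ and $\beta \bmod \lambda = \bar b$ (recall $\bar b \in k^\times$ from above).
Define $M_X$ to be free with basis $e_1, e_2$ and define $L_X = (\mathcal{O}/\lambda^n)e_2$. Also,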


define ‘ 8 
a a a 
Fux = Gee ’ Vaux = ee i) . 


It is easy to check that for each of the $|k|^{2(n-1)}$ different choices of $(\alpha, \beta)$,
the corresponding $(L_X, M_X)$ is an object in $SH^f_{\mathcal{O}/\lambda^n}$ and that different pairs
$(\alpha, \beta)$ give rise to non-isomorphic flat deformations of $\bar\rho$ of the desired type.
When $\mathcal{O} = W(k)$, [27] gives a non-explicit direct proof that
$|D^{\mathrm{fl}}_{\mathcal{O}}(\mathcal{O}/\lambda^n)| = |k|^{2(n-1)}$ for $\bar\rho$ irreducible or reducible as in Theorem 3.5.

Before finishing off the residually reducible case in Theorem 3.5 (with 
trivial centralizer), note that what we have done so far completes the proof 
of Lemma 3.4 in all cases, which is what is needed in Wiles’ method. 

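Back to the reducible flat $\bar\rho$ in Theorem 3.5. By the argument used
to prove Theorem 5.1, we can choose a $k$-basis $\{e_1, e_2\}$ for $M(\bar\rho)$ with
$L(\bar\rho) = ke_2$ and


$$F_{M(\bar\rho)} = \begin{pmatrix} \bar v & \bar b \\ 0 & 0 \end{pmatrix}, \quad V_{M(\bar\rho)} = \begin{pmatrix} 0 & -\bar v^{-1}\bar u\bar b \\ 0 & \bar u \end{pmatrix},$$


with $\bar u, \bar v \in k^\times$ and $\bar b \in k$. Fix $b \in \mathcal{O}/\lambda^n$ lifting $\bar b$. For each of the $|k|^{2(n-1)}$
pairs $(u, v)$ with $u, v \in \mathcal{O}/\lambda^n$ lifting $\bar u$ and $\bar v$ respectively, define


$$F_M = \begin{pmatrix} v & b \\ 0 & u^{-1}p \end{pmatrix}, \quad V_M = \begin{pmatrix} v^{-1}p & -v^{-1}ub \\ 0 & u \end{pmatrix}.$$


It is easy to check that in this way, $(L, M)$ acquires the structure of an
object in $SH^f_{\mathcal{O}/\lambda^n}$ with $\lambda$-torsion isomorphic to $LM(\bar\rho)$, so the corresponding
$(\mathcal{O}/\lambda^n)[D_p]$-module gives rise to a flat deformation of $\bar\rho$ to $GL_2(\mathcal{O}/\lambda^n)$.
Moreover, different pairs $(u, v)$ are readily checked to give rise to
non-isomorphic deformations. This concludes the proof of Theorem 3.5.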
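
The count of lifts and the relation $FV = VF = p$ for the matrices $F_M$, $V_M$ above can be checked mechanically in a toy case. The sketch below assumes $\mathcal{O}/\lambda^n = \mathbb{Z}/p^n$ (so $k = \mathbb{F}_p$), which is only one instance of the setting in the text, and the helper names are hypothetical. It enumerates all lifts $(u, v)$ of a residual pair $(\bar u, \bar v)$ and confirms that each satisfies $F_M V_M = V_M F_M = p$. It does not test the remaining conditions for membership in the category, nor the non-isomorphism claim.

```python
def matmul2(A, B, m):
    """Multiply two 2x2 matrices with entries reduced modulo m."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(2)) % m for j in range(2)]
        for i in range(2)
    ]


def count_lifts(p, n, u_bar, v_bar, b):
    """Count pairs (u, v) in Z/p^n lifting (u_bar, v_bar) for which
    F = [[v, b], [0, u^-1 p]] and V = [[v^-1 p, -v^-1 u b], [0, u]]
    satisfy F V = V F = p (as scalar matrices modulo p^n)."""
    m = p ** n
    p_scalar = [[p % m, 0], [0, p % m]]
    count = 0
    for u in range(u_bar % p, m, p):        # all lifts of u_bar
        for v in range(v_bar % p, m, p):    # all lifts of v_bar
            u_inv = pow(u, -1, m)           # units, since u_bar, v_bar != 0 mod p
            v_inv = pow(v, -1, m)
            F = [[v % m, b % m], [0, (u_inv * p) % m]]
            V = [[(v_inv * p) % m, (-v_inv * u * b) % m], [0, u % m]]
            if matmul2(F, V, m) == p_scalar and matmul2(V, F, m) == p_scalar:
                count += 1
    return count


print(count_lifts(5, 3, 2, 3, 7))
```

For $p = 5$, $n = 3$ every one of the $5^{2(3-1)} = 625$ lifts passes the check, matching the count $|k|^{2(n-1)}$ in the text.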
HECKE RINGS AND UNIVERSAL DEFORMATION
RINGS 


EHUD DE SHALIT 


1. INTRODUCTION 


Wiles’ proof of the Shimura-Taniyama-Weil conjecture for semi-stable 
elliptic curves is based on the “modularity” of certain universal deformation 
rings. 

Fix an odd irreducible representation


(1.1) $\bar\rho : G_{\mathbb{Q}} \to GL_2(k)$


from the absolute Galois group of $\mathbb{Q}$ to the group of $2 \times 2$ invertible matrices
over a finite field $k$, and a deformation type $\mathcal{D}$ (see section 2 for precise
definitions). One then constructs a certain complete noetherian local ring,
the universal deformation ring $R_{\mathcal{D}} = R_{\mathcal{D}}(\bar\rho)$, and a universal deformation
$\rho^{\mathrm{univ}}_{\mathcal{D}} : G_{\mathbb{Q}} \to GL_2(R_{\mathcal{D}})$, whose specializations give all the deformations of
$\bar\rho$ of type $\mathcal{D}$, up to strict equivalence (see [M2], [M3]). Here it is implicitly
assumed that $\bar\rho$ itself is of type $\mathcal{D}$, to begin with.

If we assume in addition that $\bar\rho$ is modular (in the sense that it comes
from reduction mod $\lambda$ of the $\lambda$-adic representation associated to some cusp
form, see 2.1 below), one can also associate to $\mathcal{D}$ another complete
noetherian local ring, the Hecke algebra $T_{\mathcal{D}}$, and a canonical homomorphism
$\varphi_{\mathcal{D}} : R_{\mathcal{D}} \to T_{\mathcal{D}}$ of local rings. $T_{\mathcal{D}}$ is the ($p$-adic completion of the) Hecke
algebra acting on all the modular forms whose associated $\lambda$-adic
representation is of type $\mathcal{D}$, and lifts $\bar\rho$. The modularity of $\bar\rho$ is needed to ensure
that there is at least one such form. The homomorphism $\varphi_{\mathcal{D}}$ is derived
from the universality of $(R_{\mathcal{D}}, \rho^{\mathrm{univ}}_{\mathcal{D}})$. Similarly, any deformation $\rho$ of $\bar\rho$ with
values in a complete local ring $R$ defines a homomorphism $\varphi : R_{\mathcal{D}} \to R$
bringing $\rho^{\mathrm{univ}}_{\mathcal{D}}$ to $\rho$, and $\rho$ is called modular if and only if $\varphi$ factors through
$\varphi_{\mathcal{D}}$. The assertion that every deformation of type $\mathcal{D}$ is modular is therefore
equivalent to the assertion that $\varphi_{\mathcal{D}}$ is an isomorphism. In this set-up the
main theorem to be proved is the following.


Theorem 1. Assume that $\mathcal{D}$ is a minimal deformation type. Then (i)
$\varphi_{\mathcal{D}}$ is an isomorphism (ii) $T_{\mathcal{D}}$ is a local complete intersection (l.c.i.).


We shall follow the proof of this theorem given by R. Taylor and A.
Wiles in the appendix to their paper¹. The proof gives (i), and the complete
intersection property (ii) comes as a by-product, rather than a prerequisite,
to (i). In the earlier proof of (i), given by Wiles in chapter 3 of [W], (ii)
had to be known in advance, and its independent proof made up the main
body of [T-W]. It was observed by Faltings that a slight modification of
the arguments of Taylor and Wiles yields both (i) and (ii) simultaneously,
and this observation was incorporated into the appendix of [T-W].


¹ Our exposition benefitted at several points also from the excellent survey paper by
Darmon, Diamond and Taylor [D-D-T].

The minimality condition on $\mathcal{D}$ (see remark 2 in section 2 below for
a precise definition) is essential for the proof to work. It means, roughly
speaking, that a deformation of type $\mathcal{D}$ has “no more ramification than
what $\bar\rho$ forces it to have.” While the main theorem remains true without
the minimality assumption, different ideas are needed to pass from minimal
to non-minimal $\mathcal{D}$. Very roughly, one measures by how much $R_{\mathcal{D}}$ and
$T_{\mathcal{D}}$ change when $\mathcal{D}$ is modified, and proves that the equality $R_{\mathcal{D}} = T_{\mathcal{D}}$
propagates from the minimal deformation type to any $\mathcal{D}$. The Gorenstein-ness
of $T_{\mathcal{D}}$ plays a crucial role in establishing a tool to measure the “change
in $T_{\mathcal{D}}$.” The passage from the minimal case to the general case is treated
in Ribet’s article in this volume [Ri2].

We shall not work in greatest generality. For example, we shall assume
throughout that the determinant of $\rho$ is the cyclotomic character, and at
$l \neq p$ ($p$ is the characteristic of $k$) we shall assume that $\mathcal{D}$ is of “type A” in
Wiles’ terminology (Wiles himself considered types “B” and “C” as well,
and Diamond [Di2] [Di3] completed the picture by allowing the restriction
of $\bar\rho$ to the decomposition group at $l$ to be arbitrary). However, this will
be enough for the application to the Shimura-Taniyama-Weil conjecture,
in the minimal case.

Before we turn to a detailed outline of the proof of theorem 1, let us
explain how it implies the Shimura-Taniyama-Weil conjecture for
semi-stable elliptic curves. By abuse of language one calls an elliptic curve $E$
over $\mathbb{Q}$ semi-stable if it has good or semi-stable (multiplicative) reduction
everywhere. This is equivalent to the fact that the conductor of $E$ is a
square-free integer. Let $E$ be a semi-stable elliptic curve defined over $\mathbb{Q}$,
take $p = 3$, and assume that $\bar\rho = \bar\rho_{E,3}$, the representation of $G_{\mathbb{Q}}$ on the
3-division points, is irreducible. (The reducible case is handled via a trick
of Wiles which involves $\bar\rho_{E,5}$ as well. See chapter 5 of [W] and [Ru].) Then
in fact $\bar\rho$ satisfies all the technical conditions listed below, in section 2.1.
Let us quickly check them.

First, $GL_2(\mathbb{F}_3)$ can be lifted to a subgroup of $GL_2(A)$ for some ring of
integers $A$ in a number field in which 3 splits completely, and therefore $\bar\rho$
can be viewed as a complex representation. Since $PGL_2(\mathbb{F}_3) \simeq S_4$, $\bar\rho$ is
modular by the Langlands-Tunnell theorem. Second, the determinant of
$\bar\rho$ is the cyclotomic character mod 3 thanks to the Weil pairing. Third, $\bar\rho$
remains absolutely irreducible when restricted to the absolute Galois group
of $L = \mathbb{Q}(\sqrt{-3})$. If $\bar\rho|G_L$ were reducible, it would have two invariant lines
(in $\bar{\mathbb{F}}_3^2$), for if it had only one such line, that line would be invariant under
$G_{\mathbb{Q}}$ as well. Thus $\bar\rho(G_L)$ is a torus, the splitting field of $\bar\rho$ is an abelian
extension $M$ of $L$, and $[M : L]$ is relatively prime to 3. Since $E$ is
semi-stable, $M/L$ would be unramified outside 3. Class field theory tells us that
there are no abelian extensions of $L$ of degree relatively prime to 3 which
are unramified outside 3, so $M = L$, contradicting the irreducibility of $\bar\rho$.
Finally, $\bar\rho$ is “type A” at every $l \neq 3$ and “Selmer” or “flat” at 3 by the
semi-stability of $E$, as follows from the $l$-adic analytic model of $E$ as a Tate
curve (see below).
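
The group-theoretic input $PGL_2(\mathbb{F}_3) \simeq S_4$ used above is small enough to confirm by enumeration. The following sketch (function names are hypothetical) lists $GL_2(\mathbb{F}_3)$, lets it act on the four points of the projective line over $\mathbb{F}_3$, and counts the distinct permutations that arise. Since only the scalar matrices act trivially, 48 matrices give 24 permutations, which is all of $S_4$.

```python
from itertools import product


def gl2(p):
    """All invertible 2x2 matrices (a, b, c, d) over F_p."""
    return [
        m for m in product(range(p), repeat=4)
        if (m[0] * m[3] - m[1] * m[2]) % p != 0
    ]


def normalize(pt, p):
    """Canonical representative of a point (x : y) on the projective line."""
    x, y = pt
    if y != 0:
        return ((x * pow(y, -1, p)) % p, 1)
    return (1, 0)  # the point at infinity


def act(m, pt, p):
    """Image of the projective point pt under the matrix m."""
    a, b, c, d = m
    x, y = pt
    return normalize(((a * x + b * y) % p, (c * x + d * y) % p), p)


p = 3
points = [(0, 1), (1, 1), (2, 1), (1, 0)]
group = gl2(p)
permutations = {tuple(act(m, pt, p) for pt in points) for m in group}
print(len(group), len(permutations))
```

The same enumeration works for other small primes, but the coincidence with a full symmetric group is special to $p = 3$.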

For any $p$, $\rho = \rho_{E,p}$, the representation of $G_{\mathbb{Q}}$ on the $p$-adic Tate module
of $E$, is a deformation of $\bar\rho = \bar\rho_{E,p}$, the representation on the $p$-division
points. It is a minimal deformation if and only if for every prime of bad
reduction (including possibly $p$) the order of the minimal discriminant $\Delta_E$
of $E$ at that prime is not divisible by $p$. This follows easily from the Tate
parametrization of $E$ at the bad primes. At a prime $l$ of split multiplicative
reduction one has $E(\bar{\mathbb{Q}}_l) = \bar{\mathbb{Q}}_l^\times/\langle q_{E,l} \rangle$ as a Galois module. In particular
one has a short exact sequence of Galois modules


(1.2) $0 \to \mu_p \to E[p] \to \mathbb{Z}/p\mathbb{Z} \to 0$


and the splitting field of $E[p]$ is $\mathbb{Q}_l(\mu_p, q_{E,l}^{1/p})$. If the order of $\Delta_E$, hence also
of the Tate period $q_{E,l}$, is divisible by $p$, $\bar\rho$ is unramified (if $l \neq p$) or flat
(if $l = p$), but $\rho$ is ramified or non-flat (resp.). On the other hand if the
order of $q_{E,l}$ is not divisible by $p$, already $\bar\rho$ is ramified (if $l \neq p$) or Selmer
non-flat (if $l = p$). If $l$ is a prime of non-split multiplicative reduction then
the above analysis applies to the unramified quadratic twist of $E$. Since
the notion of being unramified or flat is invariant under unramified twists,
the same conclusion holds.

Now let $\mathcal{D}$ be the minimal deformation type described in section 2.2
below, and assume that the order of $\Delta_E$ at primes of bad reduction is not
divisible by 3. Then $\rho = \rho_{E,3}$ is a deformation of type $\mathcal{D}$ of $\bar\rho$. The main
theorem therefore implies that $\rho$ factors through $T_{\mathcal{D}}$, so there exists a ring
homomorphism $h : T_{\mathcal{D}} \to \mathbb{Z}_3$, such that $h \circ \varphi_{\mathcal{D}}$ brings $\rho^{\mathrm{univ}}_{\mathcal{D}}$ to $\rho$. But $h$
defines a $\mathbb{Z}_3$-valued weight-2 newform with trivial nebentypus, and for all but
finitely many primes $l$, $\varphi_{\mathcal{D}}(\mathrm{tr}(\rho^{\mathrm{univ}}_{\mathcal{D}}(\mathrm{Frob}_l))) = T_l$ is the $l$-Hecke operator.
It follows that $\mathrm{tr}(\rho(\mathrm{Frob}_l)) = h(T_l)$. Since also $\det(\rho(\mathrm{Frob}_l)) = l$, and $\rho$ is
irreducible, $\rho$ is the representation associated to the modular form $h$. If
the level of $h$ is $N$ then the Isogeny Theorem (due in this case to Serre and
in general to Faltings) implies that there exists a non-constant morphism
from the modular curve $X_0(N)$ to $E$. However, for many applications,
such as the analytic continuation and functional equation of $L(E/\mathbb{Q}, s)$, it
is enough to know that $\rho$ is associated to a modular form.


Example 1. Take for $E$ the curve $y^2 + xy = x^3 - x^2 - x$, $p = 3$, and $\bar\rho = \bar\rho_{E,3}$.
Then


• $\mathrm{Im}(\bar\rho) = GL_2(\mathbb{F}_3)$ (exercise!), $\det(\bar\rho) = \omega$
• $\bar\rho$ is flat non-ordinary at 3 ($E$ has good, supersingular reduction there)
• $\bar\rho$ is semi-stable ramified at 73 ($E$ is a Tate curve at 73 and $\Delta_E = 73$)
• $E$ has good reduction outside 73.


The first point implies the irreducibility of $\bar\rho$. The last two points imply
that $\rho_{E,3}$ is a minimal deformation of $\bar\rho$. Thus the main theorem applies
as it stands, and proves that $E$ is modular. Indeed, it is the curve labelled
73A in the Antwerp tables.
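
Two of the numerical claims in Example 1 are easy to verify directly. The sketch below (function names are hypothetical) computes the discriminant of the Weierstrass equation $y^2 + xy = x^3 - x^2 - x$ from the standard $b$-invariants and counts points over $\mathbb{F}_3$ to obtain the trace of Frobenius $a_3$. A discriminant of 73 gives good reduction away from 73 with $\mathrm{ord}_{73}(\Delta_E) = 1$, which is not divisible by 3, and $a_3 \equiv 0 \bmod 3$ means the reduction at 3 is supersingular.

```python
def discriminant(a1, a2, a3, a4, a6):
    """Discriminant of y^2 + a1*x*y + a3*y = x^3 + a2*x^2 + a4*x + a6."""
    b2 = a1 * a1 + 4 * a2
    b4 = 2 * a4 + a1 * a3
    b6 = a3 * a3 + 4 * a6
    b8 = a1 * a1 * a6 + 4 * a2 * a6 - a1 * a3 * a4 + a2 * a3 * a3 - a4 * a4
    return -b2 * b2 * b8 - 8 * b4 ** 3 - 27 * b6 * b6 + 9 * b2 * b4 * b6


def trace_of_frobenius(coeffs, p):
    """a_p = p + 1 - #E(F_p), counting affine points plus the point at infinity."""
    a1, a2, a3, a4, a6 = coeffs
    affine = sum(
        1
        for x in range(p)
        for y in range(p)
        if (y * y + a1 * x * y + a3 * y - (x ** 3 + a2 * x * x + a4 * x + a6)) % p == 0
    )
    return p + 1 - (affine + 1)


E = (1, -1, 0, -1, 0)  # y^2 + xy = x^3 - x^2 - x
print(discriminant(*E), trace_of_frobenius(E, 3))
```

The surjectivity of $\bar\rho$ onto $GL_2(\mathbb{F}_3)$, left as an exercise in the text, is not addressed by this check.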


2. AN OUTLINE OF THE PROOF 


2.1. Set-up. Let $p \geq 3$ be a prime number, $k$ a finite field of
characteristic $p$, and $\bar\rho : G_{\mathbb{Q}} \to GL_2(k)$ an odd irreducible continuous representation.
Assume


• $\bar\rho$ is modular: there exist (a) a newform $f$ of weight $\kappa$, level $N$ and
nebentypus $\psi$ (for some $\kappa$, $N$ and $\psi$), (b) a prime $\lambda$ in the field $K_f$
generated over $\mathbb{Q}$ by the Fourier coefficients of $f$, dividing $p$, (c) an
embedding of $\mathcal{O}_{K_f}/\lambda$ in $k$, such that for every $l$ not dividing $Np$, $\bar\rho$ is
unramified at $l$ and


(2.1) $\det(X - \bar\rho(\mathrm{Frob}_l)) = X^2 - a_l(f)X + \psi(l)l^{\kappa-1} \bmod \lambda$

• $\det(\bar\rho) = \omega$, the cyclotomic character mod $p$


• $\bar\rho|G_L$ is absolutely irreducible, where $L = \mathbb{Q}\left(\sqrt{(-1)^{(p-1)/2}p}\right)$


• $\bar\rho|G_l \sim \begin{pmatrix} \chi^{-1}\omega & * \\ 0 & \chi \end{pmatrix}$ and $\chi$ is unramified, for $l \neq p$ ($G_l$ is the
decomposition group at $l$)

• $\bar\rho|G_p$ is either flat (it is the Galois module attached to the generic
fiber of a finite flat $k$-vector group scheme over $\mathbb{Z}_p$), or it is not flat
but Selmer: $\bar\rho|G_p \sim \begin{pmatrix} \psi^{-1}\omega & * \\ 0 & \psi \end{pmatrix}$ for an unramified $\psi$. (In such a
case, the $*$ has to be “très ramifié” in the language of [Se], otherwise
$\bar\rho|G_p$ would be both Selmer and flat.)


Remark 1. (i) $N(\bar\rho)$, the prime-to-$p$ Artin conductor of $\bar\rho$, is square free,
and $\bar\rho$ is ramified at $p$ too.

(ii) In the Selmer non-flat case (at $p$), and in the ramified case (at
$l \neq p$), the unramified character $\psi$ (resp. $\chi$) is trivial or quadratic. This
observation is due to Diamond ([Di1], 6.1 and 6.2).


For example, at $l \neq p$, the proof of (ii) goes as follows: $\bar\rho|G_l$ factors
through a two-step solvable extension of $\mathbb{Q}_l$, with an unramified quotient
and a tamely ramified submodule which is isomorphic to $k$. The unramified
quotient acts by conjugation on the ramified submodule through the
character $\omega\chi^{-2}$. On the other hand, we know from the structure theory
of tamely ramified extensions that this action should be through $\omega$, hence
$\chi^2 = 1$.
2.2. Deformations. Let $\mathcal{O}$ be the ring of integers in a finite extension of
$\mathbb{Q}_p$, such that, denoting by $\lambda$ the prime ideal of $\mathcal{O}$, $\mathcal{O}/\lambda = k$. Let $\Sigma$ be
the set of primes where $\bar\rho$ is ramified ($p \in \Sigma$). The key technical point
in the proof of theorem 1 is the introduction of an auxiliary set of primes
$Q = \{q_1, \dots, q_r\}$ satisfying:

• $Q \cap \Sigma = \emptyset$, and $q \equiv 1 \bmod p$ for all $q \in Q$.

• For all $q \in Q$, $\bar\rho(\mathrm{Frob}_q)$ has distinct eigenvalues $\{\alpha_q, \beta_q\}$ contained in
$k$.

If $k$ is too small to contain the eigenvalues of some Frobenius, replace
it by its quadratic extension, and change $\mathcal{O}$ and $\lambda$ accordingly. More
assumptions on $Q$ will be imposed along the way. The set $Q$ will vary, but
ultimately our interest lies in $Q = \emptyset$. By abuse of notation we shall write $Q$
also for the product of the primes in $Q$. Let $\mathbb{Q}_S$ (where $S$ is a set of primes)
be the maximal extension of $\mathbb{Q}$ which is unramified outside the primes in $S$,
and $G_S = \mathrm{Gal}(\mathbb{Q}_S/\mathbb{Q})$. We are now ready to define the deformation type.


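Definition 1. A deformation of type $\mathcal{D}_Q$ of $\bar\rho$ is a continuous
representation $\rho : G_{\Sigma \cup Q} \to GL_2(R)$, where $R$ is a local complete noetherian
$\mathcal{O}$-algebra, with maximal ideal $\mathfrak{m}_R$ and residue field $R/\mathfrak{m}_R = k$, satisfying


• $\rho \bmod \mathfrak{m}_R = \bar\rho$
• $\det(\rho) = \varepsilon$, the cyclotomic character
• $\rho|G_l \sim \begin{pmatrix} \chi^{-1}\varepsilon & * \\ 0 & \chi \end{pmatrix}$ and $\chi$ is unramified, for $l \in \Sigma$, $l \neq p$ ($\rho$ is “type
A”)
• If $\bar\rho|G_p$ is flat, so is $\rho$, meaning that for every ideal $I \subset R$ with $R/I$
finite, $\rho \bmod I$ is the Galois module attached to the generic fiber of
a finite flat group scheme over $\mathbb{Z}_p$, endowed with an action of $R/I$
making it free of rank 2 over $R/I$.
If $\bar\rho$ is not flat but Selmer then $\rho|G_p \sim \begin{pmatrix} \psi^{-1}\varepsilon & * \\ 0 & \psi \end{pmatrix}$ for an
unramified $\psi$.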

Remark 2. (i) As above, in the Selmer non-flat case (at $p$), and in the
ramified case (at $l \in \Sigma$, $l \neq p$), the unramified character $\psi$ (resp. $\chi$) is
trivial or quadratic. Since $p \neq 2$, it follows that it is the same character as
the one figuring in $\bar\rho$.

(ii) When $Q = \emptyset$, $\mathcal{D}$ ($= \mathcal{D}_\emptyset$) is minimal: if $\bar\rho$ is unramified (at $l \neq p$) or
flat (at $p$), so is $\rho$.

Although no condition is imposed at $q \in Q$, we have


Lemma 2. ([T-W], appendix, lemma 7) If $q \in Q$, then


$$\rho|G_q \sim \begin{pmatrix} \phi_1 & 0 \\ 0 & \phi_2 \end{pmatrix}.$$


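Proof: It is enough to prove the lemma when $R$ is Artinian. Since $\bar\rho$
is unramified at $q$, and $R$ is a $p$-adic ring, $\rho(I_q)$ is a pro-$p$ group. Since
$q \neq p$, $\rho$ is tamely ramified at $q$. Let $f$ and $t$ be topological generators of
$\mathrm{Gal}(\mathbb{Q}_q^{\mathrm{tr}}/\mathbb{Q}_q)$ such that $f$ restricts to the Frobenius automorphism on the
maximal unramified extension $\mathbb{Q}_q^{\mathrm{ur}}$ of $\mathbb{Q}_q$, and $t$ fixes $\mathbb{Q}_q^{\mathrm{ur}}$. Using the fact
that $\alpha_q \neq \beta_q$, choose a basis for the space of $\rho$ in which $\rho(f)$
is diagonal. Since $\bar\rho$ is unramified at $q$, $\rho(t) \equiv 1 \bmod \mathfrak{m}_R$. Now suppose

that $\rho(t) = \begin{pmatrix} \mu_1 & 0 \\ 0 & \mu_2 \end{pmatrix}(1 + N) = M(1 + N)$ and $N \equiv 0 \bmod \mathfrak{m}_R^n$ ($n \geq 1$).
Since $M$ and $N$ commute modulo $\mathfrak{m}_R^{n+1}$ we have


$$\rho(t)^q = \begin{pmatrix} \mu_1^q & 0 \\ 0 & \mu_2^q \end{pmatrix}(1 + qN) \bmod \mathfrak{m}_R^{n+1}.$$


Using the relation $ftf^{-1} = t^q$ one gets


$$\begin{pmatrix} \mu_1 & 0 \\ 0 & \mu_2 \end{pmatrix}\left(1 + \rho(f)N\rho(f)^{-1}\right) = \begin{pmatrix} \mu_1^q & 0 \\ 0 & \mu_2^q \end{pmatrix}(1 + qN) \bmod \mathfrak{m}_R^{n+1}$$


which implies (since $q \equiv 1 \bmod p$) that $N$ is diagonal $\bmod\ \mathfrak{m}_R^{n+1}$, and the
desired result follows by induction on the nilpotency degree of $\mathfrak{m}_R$. □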


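Definition 2. Let $\Delta_q$ be the $p$-Sylow subgroup of $(\mathbb{Z}/q\mathbb{Z})^\times$, and let
$\chi_q : G_{\mathbb{Q}} \to \Delta_q$ be the composition of the cyclotomic character mod $q$ and the
projection from $(\mathbb{Z}/q\mathbb{Z})^\times$ to $\Delta_q$. (We call $\chi_q$ the nebentypus character of
conductor $q$.) Further, let


$$\Delta_Q = \prod_{q \in Q} \Delta_q, \quad \Lambda_Q = \mathcal{O}[\Delta_Q] \text{ (a local ring)}, \quad \text{and} \quad \chi_Q = \prod_{q \in Q} \chi_q.$$

Let us also distinguish between $\phi_1$ and $\phi_2$ using the convention that $\phi_1 \bmod
\mathfrak{m}_R$ sends $\mathrm{Frob}_q$ to $\alpha_q$ (this is possible since $\alpha_q \neq \beta_q$).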
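
The projection from $(\mathbb{Z}/q\mathbb{Z})^\times$ onto its $p$-Sylow subgroup $\Delta_q$ in Definition 2 can be made explicit. In the sketch below (the function name is hypothetical), write $q - 1 = p^e m$ with $p \nmid m$; raising to the power $m \cdot (m^{-1} \bmod p^e)$ kills the prime-to-$p$ part and is the identity on elements of $p$-power order, so it is the projection onto $\Delta_q$.

```python
def sylow_projection(a, q, p):
    """Project a unit a modulo the prime q onto the p-Sylow subgroup
    of the cyclic group (Z/qZ)^*."""
    prime_to_p = q - 1
    p_part = 1
    while prime_to_p % p == 0:
        prime_to_p //= p
        p_part *= p
    if p_part == 1:
        return 1  # trivial p-Sylow subgroup
    # Exponent congruent to 1 mod p_part and to 0 mod prime_to_p.
    exponent = prime_to_p * pow(prime_to_p, -1, p_part)
    return pow(a, exponent, q)


# q = 11 is congruent to 1 mod 5, so Delta_q has order 5.
delta_11 = {sylow_projection(a, 11, 5) for a in range(1, 11)}
# q = 101 is congruent to 1 mod 25, so Delta_q has order 25.
delta_101 = {sylow_projection(a, 101, 5) for a in range(1, 101)}
print(sorted(delta_11), len(delta_101))
```

Choosing $q \equiv 1 \bmod p^n$ rather than merely $q \equiv 1 \bmod p$ makes $\Delta_q$ larger, which is how the order $p^{n_i}$ with $n_i \geq n$ arises later in the text.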


Corollary 3. $\phi_1|I_q = (\phi_2|I_q)^{-1}$ factors through $\chi_q|I_q$: $\phi_i|I_q = \tilde\phi_i \circ \chi_q|I_q$


for a unique character $\tilde\phi_i : \Delta_q \to (1 + \mathfrak{m}_R)$.


Proof: The first equality follows simply from the fact that $\varepsilon = \phi_1\phi_2$ is
unramified at $q$. Since $\phi_1 \bmod \mathfrak{m}_R$ is unramified too, $\phi_1(I_q) \subset 1 + \mathfrak{m}_R$. Now
$\phi_1$ factors through the maximal abelian extension of $\mathbb{Q}_q$, which, by the local
Kronecker-Weber theorem, is generated by roots of unity. In particular
$\phi_1|I_q$ factors through the cyclotomic character into $\mathbb{Z}_q^\times$, and since $1 + \mathfrak{m}_R$
is a pro-$p$ group, through the cyclotomic character mod $q$, and eventually
through $\chi_q$. □

The local conditions defining the deformation type $\mathcal{D}_Q$ are “conditions”
in the technical sense of [M3]. It follows from the irreducibility of $\bar\rho$ that
the deformation problem is representable. Thus there exist a universal
deformation ring $R_Q$, and a universal deformation $\rho_Q^{\mathrm{univ}} : G_{\Sigma \cup Q} \to GL_2(R_Q)$,
such that every deformation $\rho : G_{\Sigma \cup Q} \to GL_2(R)$ of type $\mathcal{D}_Q$ is strictly
equivalent to a unique specialization of $\rho_Q^{\mathrm{univ}}$ under a unique homomorphism
$R_Q \to R$.

Next, apply the above discussion, on the shape of $\rho|I_q$, to the universal
deformation $\rho_Q^{\mathrm{univ}}$. Denote the $\tilde\phi_2$ of the corollary in this case by $\tilde\phi_2^{\mathrm{univ}}$. We
shall give $R_Q$ the structure of a $\Lambda_Q$-algebra by mapping $\Delta_Q$ to it via the
homomorphism $(\tilde\phi_2^{\mathrm{univ}})^2$ (the reason for the square will become clear soon;
note that $R_Q$ is already an $\mathcal{O}$-algebra). Let $\mathfrak{a}_Q$ be the augmentation ideal
in $\Lambda_Q$. Then from lemma 2 and corollary 3, $\rho_Q^{\mathrm{univ}} \bmod \mathfrak{a}_Q R_Q$ is unramified
at the $q$’s dividing $Q$, hence it is of type $\mathcal{D}$. The universality of $R$ ($= R_\emptyset$)
implies that there is a unique homomorphism $R \to R_Q/\mathfrak{a}_Q R_Q$ bringing
$\rho^{\mathrm{univ}}$ to $\rho_Q^{\mathrm{univ}} \bmod \mathfrak{a}_Q R_Q$. On the other hand $\rho^{\mathrm{univ}}$ is clearly of type $\mathcal{D}_Q$,
so the universality of $R_Q$ implies that there is a unique homomorphism
$R_Q \to R$ bringing $\rho_Q^{\mathrm{univ}}$ to $\rho^{\mathrm{univ}}$. We conclude that $R_Q/\mathfrak{a}_Q R_Q$ can be
canonically identified with $R$.


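2.3. The Hecke ring. Put $N = N(\bar\rho)$ if $\bar\rho|G_p$ is flat, and $N = N(\bar\rho)p$
if $\bar\rho|G_p$ is non-flat but Selmer. (This will turn out to be the minimal
level at which one can find a weight 2 newform whose associated Galois
representation lifts $\bar\rho$.) Let


(2.2) $\Gamma_Q = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma_0(NQ) \;\middle|\; \text{the order of } d \bmod Q \text{ in } (\mathbb{Z}/Q\mathbb{Z})^\times \text{ is prime to } p \right\}$


so that $\Gamma_0(NQ)/\Gamma_Q \simeq \Delta_Q$. Let $S_2(\Gamma_Q, \mathcal{O})$ be the space of $\mathcal{O}$-valued weight-2
cusp-forms on $\Gamma_Q$, and let $T(\Gamma_Q)$ be the subalgebra of $\mathrm{End}(S_2(\Gamma_Q, \mathcal{O}))$
generated over $\mathcal{O}$ by the Hecke operators $T_l$ and $\langle l \rangle$ for primes $l$ not dividing
$NQ$, and $U_l$ for $l | NQ$. Since $\langle l \rangle$ depends only on the image of $l$ in $\Delta_Q$, the
diamond operators make $T(\Gamma_Q)$ a $\Lambda_Q$-algebra.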

Let $\mathfrak{m}_Q$ be the ideal of $T(\Gamma_Q)$ generated by $\lambda$, $T_l - \mathrm{tr}(\bar\rho(\mathrm{Frob}_l))$ and
$l\langle l \rangle - \det(\bar\rho(\mathrm{Frob}_l))$ for $(l, NQ) = 1$, $U_l - \chi(\mathrm{Frob}_l)$ for $l | N(\bar\rho)$, $U_p - \psi(\mathrm{Frob}_p)$
if $\bar\rho$ is Selmer non-flat, and $U_q - \beta_q$ for $q | Q$. Here $\chi$ and $\psi$ are the unramified
characters figuring in section 2.1, and $\beta_q$ is the eigenvalue of $\bar\rho(\mathrm{Frob}_q)$ which
coincides with $\phi_2^{\mathrm{univ}}(\mathrm{Frob}_q)$ (recall that we chose $\phi_2^{\mathrm{univ}}$ to define the action
of $\Delta_Q$ on $R_Q$). The expressions of the form $X - x$, where $X \in T(\Gamma_Q)$ and
$x \in k$, are shorthand for $X - \tilde x$, where $\tilde x$ is a lifting of $x$ to $\mathcal{O}$. Since $\lambda \in \mathfrak{m}_Q$,
it does not matter which lifting we choose. If $\mathfrak{m}_Q$ is a proper ideal (i.e.,
not equal to the whole ring), it is clearly maximal, and the homomorphism
from $T(\Gamma_Q)$ to $k$ defined by it is a $k$-valued eigenform, whose associated
representation is $\bar\rho$.

The following theorem is very deep and contains Ribet’s theorem on
“lowering the level” [Ri1], as well as improvements due to Carayol, Gross,
Coleman-Voloch, Edixhoven, Wiles and Diamond (although under the
circumstances considered here, some of these names may be dispensable).
(See [Di1] theorem 6.4 and [EG]).


Theorem 4. $\mathfrak{m}_Q$ is a proper maximal ideal in $T(\Gamma_Q)$.
Note that $\mathfrak{m}_Q$, if proper, contains (at least one) minimal prime, which
corresponds to a weight 2 newform on $\Gamma_Q$ whose associated Galois
representation lifts $\bar\rho$. Thus the theorem is equivalent to the statement that $\bar\rho$
is modular of weight 2 and the minimal possible level. Note that this is a
variant of Serre’s “Conjecture $\varepsilon$,” because Serre stipulated a level which
was always prime-to-$p$. In the Selmer non-flat case considered here (where
$p | N$) Serre would “pay” for the omission of $p$ from the level in allowing the
newform to be of weight $p + 1$ rather than 2. Note also that here, and only
here, is the crucial hypothesis that $\bar\rho$ is modular (of some weight and level)
being used. Sketching the proof of this theorem would take us outside the
scope of our survey. Let us indicate only how it follows, in the form stated
here, from [Di1], theorem 6.4.

Consider first the case $Q = \emptyset$. It then follows from [Di1], theorem 6.4,
that there is a newform $f$ of weight 2 and level $N$, a prime $\lambda'$ dividing $p$ in
the field $K_f$, and an embedding of $\mathcal{O}_{K_f}/\lambda'$ in $k$ such that for every prime
$l$ not dividing $Np$,


$$a_l(f) = \mathrm{tr}(\bar\rho(\mathrm{Frob}_l)) \bmod \lambda'.$$


Parts (2) and (3) of theorem 6 below, describing the restriction of the
$\lambda'$-adic representation associated to $f$ to the decomposition groups at bad $l$
(those dividing $N(\bar\rho)$) or at $l = p$, imply that $a_l(f) = \chi(\mathrm{Frob}_l) \bmod \lambda'$ for
$l | N(\bar\rho)$, and $a_p(f) = \psi(\mathrm{Frob}_p) \bmod \lambda'$ in the Selmer non-flat case. The
nebentypus of $f$ has order prime to $p$, so can be read from the determinant
of $\bar\rho = \rho_{f,\lambda'} \bmod \lambda'$. But $\det(\bar\rho) = \omega$, and the weight is 2, so the nebentypus
is trivial, and $f$ is on $\Gamma_0(N)$. The kernel of the homomorphism of the Hecke
algebra (into $k$) defined by $f \bmod \lambda'$ is the desired maximal ideal.

When $Q$ is not empty, one resorts to the theory of old-forms. For
simplicity assume that $Q = \{q\}$ consists of a single prime (the general case
is handled similarly). If $\alpha$ and $\beta$ are the two eigenvalues of $\rho_{f,\lambda'}(\mathrm{Frob}_q)$,
then $\alpha\beta = q$ and $\alpha + \beta = a_q(f)$. It follows that in the two-dimensional
space of old-forms spanned by $f_1(z) = f(z)$ and $f_2(z) = f(qz)$, the matrix
of $U_q$ is $\begin{pmatrix} a_q(f) & 1 \\ -q & 0 \end{pmatrix}$, and there exists a unique linear combination $g$ of $f_1$
and $f_2$ satisfying $U_q g = \beta g$. This $g$ is an old eigenform on $\Gamma_0(NQ)$, hence
on $\Gamma_Q$, and defines a homomorphism from $T(\Gamma_Q)$ to $k$ whose kernel is the
desired $\mathfrak{m}_Q$. This completes the proof. An important point to bear in mind
is that since $q \equiv 1 \bmod p$, there are other newforms on $\Gamma_Q$ whose $\lambda$-adic
representation lifts $\bar\rho$, i.e., which are congruent to $g$. These newforms have
non-trivial nebentypus factoring through $\Delta_Q$. This will become clear once
we prove, in section 3, the main theorem about the structure of $T_Q$. □
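
The old-form computation can be illustrated with a few lines of modular arithmetic. The sketch below relies on the standard relations $U_q f_1 = a_q(f) f_1 - q f_2$ and $U_q f_2 = f_1$ for a weight-2 form of level prime to $q$ with trivial nebentypus; these are supplied here as an assumption, since the text does not spell them out. With them, $g = f_1 - \alpha f_2$ satisfies $U_q g = \beta g$. The function name and the sample values ($p = 5$, $q = 11$, $\alpha = 2$, $\beta = 3$) are hypothetical.

```python
def uq_eigenform(a_q, q, beta, p):
    """Return (c1, c2) mod p with g = c1*f1 + c2*f2 and U_q g = beta*g,
    where U_q f1 = a_q*f1 - q*f2 and U_q f2 = f1 (coefficients mod p)."""
    # beta must be a root of X^2 - a_q X + q modulo p.
    if (beta * beta - a_q * beta + q) % p != 0:
        raise ValueError("beta is not an eigenvalue of Frobenius mod p")
    alpha = (a_q - beta) % p          # the other root, since alpha + beta = a_q
    c1, c2 = 1, (-alpha) % p          # g = f1 - alpha*f2
    # Apply U_q to g in the basis (f1, f2).
    image = ((a_q * c1 + c2) % p, (-q * c1) % p)
    if image != ((beta * c1) % p, (beta * c2) % p):
        raise AssertionError("g is not a beta-eigenvector")
    return (c1, c2)


# q = 11 is 1 mod 5; alpha = 2, beta = 3 satisfy alpha*beta = 6 = q mod 5.
print(uq_eigenform(a_q=0, q=11, beta=3, p=5))
```

Because $\alpha \neq \beta$, the two eigenlines are distinct, which is why the eigenvector for $\beta$ is unique up to scaling.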


Definition 3. Let $T_Q$ be the localization of $T(\Gamma_Q)$ at $\mathfrak{m}_Q$. Let $X_Q$ be
the modular curve over $\mathbb{Q}$ corresponding to the congruence group $\Gamma_Q$, and
$J_Q = \mathrm{Jac}(X_Q)$.
The Hecke operators act as correspondences on $X_Q$, and as
endomorphisms on $J_Q$. Let $\mathrm{Ta}_p(J_Q)$ be the $p$-adic Tate module of $J_Q$, which
becomes a $T(\Gamma_Q)$-module after we tensor it over $\mathbb{Z}_p$ with $\mathcal{O}$. We can
therefore localize it at $\mathfrak{m}_Q$. The resulting module, $\mathrm{Ta}_p(J_Q)_{\mathfrak{m}_Q}$, is a $T_Q$-module.
Thanks to the assumption that the residual representation of $\mathfrak{m}_Q$, namely
$\bar\rho$, is irreducible, we know the following result, which again is very deep,
and is related to the Gorenstein-ness of $T_Q$ (see [M1] and [Ti]).


Theorem 5. $\mathrm{Ta}_p(J_Q)_{\mathfrak{m}_Q}$ is free of rank 2 over $T_Q$. □


Definition 4. Let $\rho = \rho_Q^{\mathrm{mod}} : G_{\mathbb{Q}} \to GL_2(T_Q)$ be the Galois
representation on $\mathrm{Ta}_p(J_Q)_{\mathfrak{m}_Q}$.


Remark 3. It is easy to see that $\mathrm{Ta}_p(J_Q)_{\mathfrak{m}_Q} \otimes_{\mathbb{Z}_p} \mathbb{Q}_p$ is free of rank 2 over
$T_Q \otimes_{\mathbb{Z}_p} \mathbb{Q}_p$. One therefore obtains $\rho$, but with entries in $T_Q \otimes_{\mathbb{Z}_p} \mathbb{Q}_p$, by
“gluing” the $p$-adic representations associated to the individual newforms.
From there it is possible to find $\rho$ with entries in $T_Q$ by the method of
“pseudo-representations” (see the discussion following [W], 2.1). However,
theorem 5 will be needed later again.


The first part of the next theorem is “classical” — it is the Eichler- 
Shimura relation. Parts (2) and (3) are more recent. 


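Theorem 6. (1) (Eichler-Shimura-Igusa) $\rho$ is unramified outside $\Sigma \cup Q$,
and for $(l, NQ) = 1$, $l \neq p$, the characteristic polynomial of $\rho(\mathrm{Frob}_l)$ is
$X^2 - T_l X + \langle l \rangle l$.

(2) (Carayol, following Deligne and Langlands) For $l | N(\bar\rho)$


$$\rho|G_l \sim \begin{pmatrix} \chi^{-1}\chi_Q\varepsilon & * \\ 0 & \chi \end{pmatrix}$$

with an unramified character $\chi$, $\chi(\mathrm{Frob}_l) = U_l$, and $\chi^2 = \chi_Q$. The $*$ in the
upper right corner is ramified.


For $q | Q$

$$\rho|G_q \sim \begin{pmatrix} \chi^{-1}\varepsilon & 0 \\ 0 & \chi\chi_Q \end{pmatrix}$$


with an unramified character $\chi$, $\chi(\mathrm{Frob}_q) = U_q$.
(3) (Unpublished correspondence of Fontaine-Serre, Wiles) If $N(\bar\rho) = N$,
$\rho$ is flat at $p$. If, on the other hand, $p | N$, then


$$\rho|G_p \sim \begin{pmatrix} \psi^{-1}\chi_Q\varepsilon & * \\ 0 & \psi \end{pmatrix}$$

with an unramified character $\psi$, and $\psi(\mathrm{Frob}_p) = U_p$. □


In case (2), the automorphic representation of $GL_2(\mathbb{A}_{\mathbb{Q}})$ associated to
any eigenform of level $NQ$ whose $\lambda$-adic representation lifts $\bar\rho$ is “special”
at $l | N(\bar\rho)$, and “principal series” at $q | Q$.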
Corollary 7. The representation $\rho_Q = \rho_Q^{\mathrm{mod}} \otimes \chi_Q^{-1/2}$ is of type $\mathcal{D}_Q$, and
there exists a unique surjective homomorphism of $\Lambda_Q$-algebras $R_Q \to T_Q$,
bringing $\rho_Q^{\mathrm{univ}}$ to $\rho_Q$.

Proof: After twisting, $\det(\rho_Q) = \varepsilon$. Parts (2) and (3) of the theorem
now imply that $\rho_Q$ is of type $\mathcal{D}_Q$. By the universality of $R_Q$ we obtain the
desired map to $T_Q$. Since this map brings $\phi_2^{\mathrm{univ}}|I_q$ to $\sqrt{\chi_Q}|I_q$, it respects
the $\Delta_q$ action (this finally justifies the peculiar square in the definition of
the $\Delta_q$-action on $R_Q$). It remains to prove that the map is surjective, or
that every Hecke operator is contained in the image. That $T_l$ and $\langle l \rangle$ are
in the image ($(l, NQ) = 1$) follows from the relations $\mathrm{tr}(\rho_Q^{\mathrm{mod}}(\mathrm{Frob}_l)) =
T_l$, and $\det(\rho_Q^{\mathrm{mod}}(\mathrm{Frob}_l)) = l\langle l \rangle$. Parts (2) and (3) of the theorem show
directly that $U_l$ ($l | N(\bar\rho)$), $U_q$ ($q | Q$) and $U_p$ (if $\bar\rho$ is Selmer non-flat) are in
the image. □

When $Q = \emptyset$ we drop the subscript and write $R$ (resp. $T$) for $R_\emptyset$ (resp.
$T_\emptyset$). The following is the main theorem (theorem 1).


Theorem 8. The map $R \to T$ is an isomorphism, and $T$ is a l.c.i.


2.4. The Taylor-Wiles-Faltings criterion. The following
commutative-algebra criterion lies at the basis of the proof of the main theorem. See
[T-W], appendix, or [dSRS]².


Lemma 9. Let $\phi : R \to T$ be a surjective homomorphism of local complete
noetherian $\mathcal{O}$-algebras, and assume that $T$ is finite and flat over $\mathcal{O}$. Suppose
that for some $r \geq 1$ and for every $n \geq 1$ there exist local complete
noetherian $\mathcal{O}$-algebras $R_Q$ and $T_Q$ and a commutative diagram

(2.3) $\begin{array}{ccccc} \mathcal{O}[[S_1, \dots, S_r]] & \to & R_Q & \to & R \\ & & \downarrow & & \downarrow \\ & & T_Q & \to & T \end{array}$

where all four maps in the square on the right of the diagram are surjective,
and
(i) $(S_1, \dots, S_r)R_Q = \mathrm{Ker}(R_Q \to R)$
(ii) $(S_1, \dots, S_r)T_Q = \mathrm{Ker}(T_Q \to T)$
(iii) $\mathfrak{b} = \mathrm{Ker}(\mathcal{O}[[S_1, \dots, S_r]] \to T_Q) \subset ((1 + S_1)^{p^n} - 1, \dots, (1 + S_r)^{p^n} - 1)$,
and $T_Q$ is free of finite rank over $\mathcal{O}[[S_1, \dots, S_r]]/\mathfrak{b}$.
(iv) $R_Q$ is topologically generated as an $\mathcal{O}$-algebra by $r$ elements.
Then $R \to T$ is an isomorphism, and they are l.c.i. □


One should note that there are two types of assumptions here. Points
(ii)-(iii) mean that the $\{T_Q\}$ are large and “regularly controlled” by $r$
parameters, namely the $S_i$’s. As $n$ increases these parameters consume an
increasingly significant part of $T_Q$. The subscript $Q$ here has no meaning,
and simply hints at the upcoming application. The $R_Q$ and $T_Q$ are not
canonical, and are not assumed even to relate to each other as we vary $n$.
Indeed, the first step in the proof of the lemma is to show, by a
“Mittag-Leffler” argument (after we reduce the picture mod $\lambda$), that it is possible
to make an inverse system of diagrams as above, and pass to a limit


$$\begin{array}{ccccc} \mathcal{O}[[S_1, \dots, S_r]] & \to & R_\infty & \to & R \\ & & \downarrow & & \downarrow \\ & & T_\infty & \to & T \end{array}$$


where $T_\infty$ is now finite free over $\mathcal{O}[[S_1, \dots, S_r]]$.


² Recent improvements of the criterion by Rubin and Schoof eliminate the need to
pass to $R_\infty$ and $T_\infty$ (see below). Here we stick to the original presentation.

On the other hand, point (iv) means that the $R_Q$ are uniformly small,
so in the limit diagram $R_\infty$ will still be generated topologically as an
$\mathcal{O}$-algebra by $r$ elements. Considerations of Krull dimension then yield
$R_\infty = T_\infty = \mathcal{O}[[X_1, \dots, X_r]]$, from which the desired equality $R = T =
\mathcal{O}[[X_1, \dots, X_r]]/(S_1, \dots, S_r)$ is finally deduced.

To apply the lemma Taylor and Wiles find an $r$ depending only on $\bar\rho$, and
for every $n \geq 1$ they choose a set $Q$ as above, containing precisely $r$ primes,
all congruent to 1 modulo $p^n$ (and not merely $p$), so that (iv) will hold.
The $R_Q$, the $T_Q$ and the maps between them are chosen as in the previous
sections. The map $\mathcal{O}[[S_1, \dots, S_r]] \to R_Q$ is taken to be the map that sends
$1 + S_i$ to a fixed generator of $\Delta_{q_i}$. If we let $p^{n_i}$ be the order of $\Delta_{q_i}$, the
$p$-Sylow subgroup of $(\mathbb{Z}/q_i\mathbb{Z})^\times$ (so that $n_i \geq n$), then $\Lambda_Q$ is identified with
$\mathcal{O}[[S_1, \dots, S_r]]/\mathfrak{b}$, where $\mathfrak{b} = ((1 + S_1)^{p^{n_1}} - 1, \dots, (1 + S_r)^{p^{n_r}} - 1)$, and the
augmentation ideal $\mathfrak{a}_Q$ is identified with $(S_1, \dots, S_r)/\mathfrak{b}$. Point (i) was already
noticed at the end of section 2.2. Points (ii) and (iii) are guaranteed by
proposition 10 below, to be proved in section 3, and by the fact that all the
$q_i \equiv 1 \bmod p^n$. The proof of proposition 10 is essentially topological. By an
extension of a result of Mazur, a certain piece of the (singular) cohomology
of the modular curve, with coefficients in $\mathcal{O}$, is free of rank one over $T_Q$.
This allows one to replace the question on the structure of $T_Q$ over $\Lambda_Q$ by a
similar question on the structure of the cohomology.
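
One way to see why the parameters $S_i$ control more and more of the picture as $n$ grows is to reduce the generators of $\mathfrak{b}$ modulo $p$: every binomial coefficient $\binom{p^n}{k}$ with $0 < k < p^n$ is divisible by $p$, so $(1 + S)^{p^n} - 1 \equiv S^{p^n} \bmod p$, and the reduction of $\Lambda_Q$ modulo $\lambda$ is a truncated polynomial ring whose truncation degree grows with $n$. The sketch below (the function name is hypothetical) checks the divisibility for small cases.

```python
from math import comb


def frobenius_congruence_holds(p, n):
    """Check that (1 + S)^(p^n) - 1 is congruent to S^(p^n) modulo p,
    i.e. that every middle binomial coefficient C(p^n, k) is divisible by p."""
    top = p ** n
    return all(comb(top, k) % p == 0 for k in range(1, top))


print(frobenius_congruence_holds(5, 2), frobenius_congruence_holds(3, 3))
```

The check is only a numerical illustration of the congruence; it plays no role in the proof of Lemma 9 itself.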

The greatest difficulty lies in point (iv). We have to choose $r$ and the $Q$’s
so that the number $r$ of primes in $Q$ (which is the number of parameters
$S_i$) is just the minimal number of generators of $R_Q$. This is guaranteed by
proposition 11 below, to be proved in section 4. The proof uses in a deep
way Tate’s global duality theorem in Galois cohomology. As will become
clear, this is where the assumption that the deformation type is minimal
enters.

We are thus left with the task of proving the following two propositions. 
Together with the commutative-algebra criterion of lemma 9, they imply 
the main theorem. 


Proposition 10. Let $Q$ be a set of $r$ primes as in section 2.2. Then (i)
$T_Q$ is finite and free over $\Lambda_Q$ (ii) $\mathrm{rank}_{\Lambda_Q} T_Q = \mathrm{rank}_{\mathcal{O}} T$. Equivalently,
$T_Q/\mathfrak{a}_Q T_Q = T$.
Proposition 11. There exists an $r$ (namely, $\dim_k H^1_{\mathcal{D}}(G_\Sigma, \mathrm{Symm}^2\bar\rho)$, see
section 4) such that for every $n \geq 1$ there exists a set $Q$ of $r$ primes, disjoint
from $\Sigma$, satisfying

• For every $q \in Q$, $q \equiv 1 \bmod p^n$

• For every $q \in Q$, $\bar\rho(\mathrm{Frob}_q)$ has distinct eigenvalues contained in $k$

• $R_Q$ can be topologically generated as an $\mathcal{O}$-algebra by $r$ elements.


3. PROOF OF PROPOSITION 10: ON THE STRUCTURE OF THE HECKE
ALGEBRA³


3.1. New and old. Let $\Gamma'_Q = \Gamma_0(NQ)$, and define the Hecke algebra
$T'_Q$ as in section 2.3, but with respect to $\Gamma'_Q$ instead of $\Gamma_Q$, omitting the
diamond operators. In other words, we first let $T(\Gamma'_Q)$ be the Hecke algebra
on $S_2(\Gamma'_Q, \mathcal{O})$ (over $\mathcal{O}$). The inclusion $S_2(\Gamma'_Q, \mathcal{O}) \subset S_2(\Gamma_Q, \mathcal{O})$ induces a
surjective homomorphism $T(\Gamma_Q) \twoheadrightarrow T(\Gamma'_Q)$. The image of the maximal
ideal $\mathfrak{m}_Q$ in $T(\Gamma'_Q)$ is also a proper maximal ideal $\mathfrak{m}'_Q$ (cf. the proof of
theorem 4), and we set $T'_Q = T(\Gamma'_Q)_{\mathfrak{m}'_Q}$. Localizing the above restriction
map between the two Hecke algebras we get a surjective homomorphism
$T_Q \twoheadrightarrow T'_Q$. Note that since $T(\Gamma_Q)$ is pro-artinian, it is a product of its
localizations at maximal ideals, and every $T(\Gamma_Q)$-module is the direct sum
of its localizations.

The next theorem (see also theorem 5) was first proved, for $\Gamma_0(N)$ and 
N prime, by Mazur, and then generalized by Tilouine, Ribet, Gross, Edix- 
hoven, and Wiles (see [W], theorem 2.1, and [Ti]). It is a “multiplicity one” 
result for certain finite Hecke modules (killed by p). The Gorenstein-ness 
of the Hecke algebra, which is a consequence of the complete intersection 
property in theorem 1, is known to follow from it. However, the Gorenstein 
property itself is not directly used in the Taylor-Wiles proof of theorem 1 in 
the minimal case (although it is heavily used in the passage from minimal 
to non-minimal $\mathcal{D}$). Let $Y_Q \subset X_Q$ be the open modular curve which is 
obtained from $X_Q$ when we delete the cusps. 


Theorem 12. The following modules are finite free over $T_Q$: 

• $\mathrm{Ta}_p(J_Q)_{\mathfrak{m}_Q} = (\mathrm{Ta}_p(J_Q) \otimes_{\mathbf{Z}_p} \mathcal{O}) \otimes_{T(\Gamma_Q)} T_Q$, free of rank 2 

• $H^1(X_Q, \mathcal{O})^{\pm}_{\mathfrak{m}_Q} = H^1(Y_Q, \mathcal{O})^{\pm}_{\mathfrak{m}_Q}$ ($\pm$ refers to the action of complex 
conjugation on the modular curve), each free of rank 1. 


Similar statements hold for $T'_Q$. □ 


The theorem, as well as the fact that we can replace the closed curve 
X by the open curve Y, rely on the irreducibility of $\bar\rho$ (so $\mathfrak{m}_Q$ is not an 
“Eisenstein” prime). 


³ We follow the proof in [TW]. F. Diamond kindly informed us that in the final version 
of [DDT] there is a new argument that directly uses the q-expansion principle instead 
of “multiplicity one.” 

Recall that $T = T_\emptyset = T'_\emptyset$ is the localization of $T(\Gamma_0(N))$ at $\mathfrak{m} = \mathfrak{m}_\emptyset$. 
While there is no map from $T(\Gamma_0(NQ))$ to $T(\Gamma_0(N))$ (the $U_q$ for $q|Q$ do not 
preserve $S_2(\Gamma_0(N), \mathcal{O}) \subset S_2(\Gamma_0(NQ), \mathcal{O})$), the next lemma shows that after 
we localize at $\mathfrak{m}'_Q$ and $\mathfrak{m}$, such a map exists, and in fact is an isomorphism. 


Lemma 13. There exists an isomorphism $T'_Q \cong T$, mapping the Hecke 
operators $T_l$ ($l \nmid NQ$) and $U_l$ ($l | N$) in $T'_Q$ to the corresponding operators in 
$T$. 


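Proof: (1) In the first step one uses the fact that $\alpha_q \neq \beta_q$, and $q \equiv 
1 \bmod p$, to show that $\mathrm{Ta}_p(J'_Q)_{\mathfrak{m}'_Q}$ is “Q-old.” By “Q-old” we mean that 
$\mathrm{Ta}_p(J'_Q)_{\mathfrak{m}'_Q}$, which is a direct summand of $\mathrm{Ta}_p(J'_Q) \otimes_{\mathbf{Z}_p} \mathcal{O}$, is contained in 
the Tate module of the Q-old subvariety of $J'_Q$. This Q-old subvariety is 
isogenous to a product of $2^r$ copies of $J = J_0(N)$. 

One can prove (1) by computing the module of fusion between the Q-old 
and Q-new parts of the Jacobian as in [W], proposition 2.4’ (p. 503, see 
the remark at the end of the section there). For simplicity let us illustrate 
the proof when $Q = \{q\}$ consists of one prime (the proof of the general 
case is the same, proceeding one q at a time). Let $(J'_Q)^{\mathrm{old}}$ and $(J'_Q)^{\mathrm{new}}$ be the 
old and new subvarieties of $J'_Q$. Thus, if $\mu : J^2 \to J'_Q$ is the map derived 
by Pic functoriality from the two degeneracy maps $X_0(Nq) \to X_0(N)$, 
and $\hat\mu : J'_Q \to J^2$ is the dual map, $(J'_Q)^{\mathrm{old}}$ is the image of $\mu$, and $(J'_Q)^{\mathrm{new}}$ 
is the connected component of $\mathrm{Ker}(\hat\mu)$. Both $(J'_Q)^{\mathrm{old}}$ and $(J'_Q)^{\mathrm{new}}$ are stable 
under the Hecke algebra, including the $U_q$-operator. Consider $J[p^\infty]_{\mathfrak{m}} = 
(J[p^\infty] \otimes_{\mathbf{Z}_p} \mathcal{O}) \otimes_{T(\Gamma)} T$ (recall $T = T(\Gamma)_{\mathfrak{m}}$ is the localization of the Hecke 
algebra at $\mathfrak{m}$) and suppose that 

$$J[p^\infty]^2_{\mathfrak{m}} \cap \mathrm{Ker}(\hat\mu \circ \mu) = \{0\}.$$

Then the relevant module of fusion 

$$F = (J'_Q)^{\mathrm{old}}[p^\infty]_{\mathfrak{m}'_Q} \cap (J'_Q)^{\mathrm{new}}[p^\infty]_{\mathfrak{m}'_Q} = \{0\},$$

for it is contained in $\mu(J[p^\infty]^2_{\mathfrak{m}}) \cap \mathrm{Ker}(\hat\mu)$. In particular 

$$(J'_Q)^{\mathrm{old}}[\mathfrak{m}'_Q] \cap (J'_Q)^{\mathrm{new}}[\mathfrak{m}'_Q] = \{0\}$$

(where $J[\mathfrak{m}]$ is the kernel of $\mathfrak{m}$ in $J[p] \otimes k$). However, it follows from 
theorem 12 that $\dim_k J'_Q[\mathfrak{m}'_Q] = 2$. Equivalently, the multiplicity of $\bar\rho$ in the 
semisimplification of this Galois module is 1. Since $\bar\rho$ is a constituent (i.e., 
a subquotient Galois module) of $(J'_Q)^{\mathrm{old}}[p^\infty]_{\mathfrak{m}'_Q}$, hence of $(J'_Q)^{\mathrm{old}}[\mathfrak{m}'_Q]$, it is not a 
constituent of $(J'_Q)^{\mathrm{new}}[\mathfrak{m}'_Q]$ or of $(J'_Q)^{\mathrm{new}}[p^\infty]_{\mathfrak{m}'_Q}$, which means that $(J'_Q)^{\mathrm{new}}[p^\infty]_{\mathfrak{m}'_Q} = 
0$, as desired. 

Clearly $J[p^\infty]^2_{\mathfrak{m}} \cap \mathrm{Ker}(\hat\mu \circ \mu)$ is a finite group, and has a decomposition 
series all of whose factors are killed by $\mathfrak{m}$, so to prove $J[p^\infty]^2_{\mathfrak{m}} \cap \mathrm{Ker}(\hat\mu \circ \mu) = 
\{0\}$, it will be enough to show that $J[\mathfrak{m}]^2 \cap \mathrm{Ker}(\hat\mu \circ \mu) = 0$. An explicit 
calculation gives 

$$\hat\mu \circ \mu = \begin{pmatrix} q+1 & T_q \\ T_q & q+1 \end{pmatrix} \in \mathrm{End}(J^2) = M_2(\mathrm{End}(J)).$$

Since $q \equiv 1 \bmod \mathfrak{m}$ and $T_q \equiv \alpha_q + \beta_q \bmod \mathfrak{m}$, the matrix representing $\hat\mu \circ \mu$ 
on $J[\mathfrak{m}]^2$ is $\begin{pmatrix} 2 & t \\ t & 2 \end{pmatrix}$ where $t = \alpha_q + \beta_q$. This matrix is non-singular since 
$\alpha_q\beta_q = q = 1$ and $\alpha_q \neq \beta_q$ imply $t \neq \pm 2$. 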
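
The non-singularity claim can be checked by brute force in a toy setting. The following is a minimal sketch, assuming for illustration that both eigenvalues lie in the prime field F_p (the text only requires them to lie in k); the function name is hypothetical. With q ≡ 1 the matrix of the composite of the degeneracy map and its dual reduces to [[2, t], [t, 2]], whose determinant 4 − t² equals −(α − β)² when αβ = 1.

```python
def fusion_matrix_is_invertible(p):
    """Check over F_p that [[2, t], [t, 2]] is invertible whenever
    t = a + b with a*b = 1 and a != b (toy model: eigenvalues in F_p)."""
    for a in range(1, p):
        b = pow(a, -1, p)          # inverse of a mod p, so a*b = 1 (Python 3.8+)
        if a == b:                 # equal eigenvalues are excluded by assumption
            continue
        t = (a + b) % p
        det = (4 - t * t) % p      # determinant of [[2, t], [t, 2]]
        # With a*b = 1 we have 4 - (a+b)^2 = -(a-b)^2.
        assert det == (-((a - b) ** 2)) % p
        if det == 0:
            return False
    return True
```

For example, `fusion_matrix_is_invertible(7)` inspects every admissible pair of eigenvalues in F_7.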

An alternative proof goes as follows. A newform g which is new at q 
must be “special” there (because q divides the level to the first power, 
compare theorem 6 (2)). The local representation $\rho_g|G_q$ therefore has 
the form described in theorem 6. Since $q \equiv 1 \bmod p$, the character $\omega$ is 
trivial on $G_q$, and we find that $\bar\rho_g|G_q \sim \begin{pmatrix} \chi & * \\ 0 & \chi \end{pmatrix}$ for a single character $\chi$, so that 
both diagonal entries coincide. On the other hand 
if g belongs to $\mathfrak{m}'_Q$ then $\bar\rho_g = \bar\rho$, and the assumption was that the two 
eigenvalues of $\bar\rho(\mathrm{Frob}_q)$ are distinct. This contradiction proves that all the 
newforms belonging to $\mathfrak{m}'_Q$ are Q-old. 

(2) On the Q-old forms the Hecke algebra is simply 

$$T(\Gamma_0(N))[u_q]/(u_q^2 - T_q u_q + \langle q \rangle q)$$

(one variable for each q in Q). Since the roots of $u_q^2 - T_q u_q + \langle q \rangle q$ are 
distinct modulo $\mathfrak{m}'_Q$, and since $U_q - \tilde\beta_q$ is contained in $\mathfrak{m}'_Q$, Hensel’s lemma 
shows that after we localize at $\mathfrak{m}'_Q$ we get $T$. □ 
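
The Hensel step in part (2) rests on the fact that a simple root of a polynomial modulo p lifts uniquely modulo every power of p. The following is a minimal sketch over the integers modulo p^n rather than over the Hecke algebra; the function name and the sample values (p = 5, a quadratic u² − 5u + 11 standing in for the Hecke polynomial with q = 11 ≡ 1 mod 5) are hypothetical choices made only for illustration.

```python
def hensel_lift(coeffs, root, p, n):
    """Lift a simple root of f modulo p to a root modulo p**n.

    coeffs = [c0, c1, c2, ...] encodes f(u) = c0 + c1*u + c2*u**2 + ...
    """
    def f(u, m):
        return sum(c * pow(u, i, m) for i, c in enumerate(coeffs)) % m

    def df(u, m):
        return sum(i * c * pow(u, i - 1, m)
                   for i, c in enumerate(coeffs) if i > 0) % m

    # The root must be simple modulo p, otherwise the lift need not exist.
    assert f(root, p) == 0 and df(root, p) != 0
    modulus = p
    for _ in range(n - 1):
        modulus *= p
        inverse = pow(df(root, modulus), -1, modulus)  # f'(root) is a unit
        root = (root - f(root, modulus) * inverse) % modulus
    return root


# u^2 - 5u + 11 reduces to u^2 + 1 modulo 5, with distinct roots 2 and 3.
r = hensel_lift([11, -5, 1], 2, 5, 4)
s = hensel_lift([11, -5, 1], 3, 5, 4)
```

Because the two roots are distinct modulo p, each lifts separately, which mirrors how localization picks out one of the two roots of the quadratic.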


3.2. Freeness of the Hecke algebra over $\Lambda_Q$. Let $Y'_Q$ be the open 
modular curve associated to $\Gamma'_Q$. As we shall prove below, $H^1(Y_Q, \mathcal{O})^-$ is 
a free $\Lambda_Q$-module of rank equal to the $\mathcal{O}$-rank of $H^1(Y'_Q, \mathcal{O})^-$. It follows 
from theorem 12 that $T_Q$ is a free $\Lambda_Q$-module, whose rank is equal to the 
$\mathcal{O}$-rank of $T'_Q$, and invoking lemma 13 we get proposition 10. 


Proposition 14. $H^1(Y_Q, \mathcal{O})^-$ is a free $\Lambda_Q$-module of rank equal to the 
$\mathcal{O}$-rank of $H^1(Y'_Q, \mathcal{O})^-$. 


Proof: First, let us pretend that $\Gamma'_Q$ had no elliptic elements. Then 
$\Gamma'_Q = \pi_1(Y'_Q)$ is free as the fundamental group of the incomplete curve. By 
Shapiro’s lemma one may identify 

$$H^1(Y_Q, \mathcal{O}) = H^1(\Gamma_Q, \mathcal{O}) = H^1(\Gamma'_Q, \mathcal{O}[\Delta_Q]). \tag{3.1}$$

The action of $\Delta_Q$, the deck-transformation group of $Y_Q/Y'_Q$, on $H^1(Y_Q, \mathcal{O})$, 
corresponds in this isomorphism to the action of $\Gamma'_Q/\Gamma_Q$ on $H^1(\Gamma_Q, \mathcal{O})$ 
through conjugation. Via the Shapiro isomorphism this, in turn, gets 
translated to the action of $\Delta_Q$ on the coefficients of $H^1(\Gamma'_Q, \mathcal{O}[\Delta_Q])$. Since $\Gamma'_Q$ 
is free, the module of 1-cocycles $Z^1(\Gamma'_Q, \mathcal{O}[\Delta_Q])$ is a free $\mathcal{O}[\Delta_Q]$-module. 
In fact, choosing g free generators for $\Gamma'_Q$, say $\gamma_1, \ldots, \gamma_g$, the map 

$$c \mapsto (c(\gamma_1), \ldots, c(\gamma_g)) \tag{3.2}$$

is an isomorphism of $Z^1(\Gamma'_Q, \mathcal{O}[\Delta_Q])$ onto $\mathcal{O}[\Delta_Q]^g$, because a cocycle can 
be fixed arbitrarily on the generators, and then it is determined on any 
word. 

The action of complex conjugation on $H^1(Y_Q, \mathcal{O})$ is deduced from the 
map $z \mapsto -\bar z$ on the upper half plane. If $\gamma \in \Gamma_Q$ gets mapped in $\pi_1(Y_Q)$ to 
the path $[\gamma]$ which is the projection (in $Y_Q$) of the geodesic $\{i, \gamma(i)\}$ 
connecting i to $\gamma(i)$ in the upper half plane, then the geodesic $\{-\bar i, -\overline{\gamma(i)}\} = 
\{i, \epsilon\gamma\epsilon^{-1}(i)\}$ projects to the path $[\epsilon\gamma\epsilon^{-1}]$, which is its complex conjugate. 
Here $\epsilon = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$, and as a base point for the fundamental group we 
took the image of i. The action of complex conjugation on $H^1(Y_Q, \mathcal{O})$ 
therefore corresponds to the action of $\epsilon$ on $H^1(\Gamma_Q, \mathcal{O})$ through 
conjugation. Via the Shapiro isomorphism it gets translated to an action of $\epsilon$ 
on $H^1(\Gamma'_Q, \mathcal{O}[\Delta_Q])$, where $\epsilon$ is still acting by conjugation on the group, 
and trivially on the coefficients. This action can be defined already at 
the level of 1-cocycles and 1-coboundaries. Clearly $Z^1(\Gamma'_Q, \mathcal{O}[\Delta_Q])^-$ is free 
over $\mathcal{O}[\Delta_Q]$, as a direct factor of a free module. About $B^1(\Gamma'_Q, \mathcal{O}[\Delta_Q])$ 
we need not worry because the 1-coboundaries are in the + eigenspace for 
the action of $\epsilon$ (since $\epsilon \begin{pmatrix} a & b \\ c & d \end{pmatrix} \epsilon^{-1} = \begin{pmatrix} a & -b \\ -c & d \end{pmatrix}$). It follows that 
$H^1(\Gamma'_Q, \mathcal{O}[\Delta_Q])^-$ is free over $\Lambda_Q = \mathcal{O}[\Delta_Q]$, and 

$$H^1(\Gamma'_Q, \mathcal{O}[\Delta_Q])^- / \mathfrak{a}_Q H^1(\Gamma'_Q, \mathcal{O}[\Delta_Q])^- = H^1(\Gamma'_Q, \mathcal{O})^- = H^1(Y'_Q, \mathcal{O})^- \tag{3.3}$$

from where the statement about the rank follows. 

Unfortunately, $\Gamma'_Q$ might have elliptic elements. This is remedied by 
introducing an auxiliary prime $R > 3$, and replacing $\Gamma_Q$ and $\Gamma'_Q$ by $\Gamma^R_Q = 
\Gamma_Q \cap \Gamma_1(R)$ and $\Gamma'^R_Q = \Gamma'_Q \cap \Gamma_1(R)$ respectively. We content ourselves with 
a brief sketch of how one modifies the arguments above, which now apply 
verbatim to $\Gamma^R_Q$ and $\Gamma'^R_Q$. To deduce proposition 10 for the original $\Gamma_Q$ 
and $\Gamma'_Q$, we need to find a maximal ideal $\mathfrak{m}^R_Q$ (resp. $\mathfrak{m}'^R_Q$) in $T(\Gamma^R_Q)$ (resp. 
$T(\Gamma'^R_Q)$) such that localization at $\mathfrak{m}^R_Q$ yields an isomorphism $T^R_Q \cong T_Q$ (resp. 
$T'^R_Q \cong T'_Q$). In the language of congruences between modular forms we want 
to make sure that there are no congruences between R-new and R-old forms 
that belong to $\mathfrak{m}^R_Q$. The computations of Ribet and Wiles, similar to those 
in lemma 13, show that this will be the case as long as 


• $\{1, \alpha_R, \beta_R, \alpha_R\beta_R = R\}$ are 4 distinct elements of k, where as before 
$\{\alpha_R, \beta_R\}$ are the two eigenvalues of $\mathrm{Frob}_R$. 


At this point there is a slight inaccuracy in [T-W], where lemma 3 of 
[D-T] is misquoted (this was pointed out to us by F. Diamond). One way 
to overcome the difficulty is to change R to $R^2$ (see [D-D-T] for details). 
However, for the purpose of proving the Shimura-Taniyama-Weil 
conjecture, it is enough to consider the case $k = \mathbf{F}_p$, $p \ge 3$. The following lemma 
(combined with the Cebotarev density theorem) therefore serves our 
purpose. 

Lemma. Let $\bar\rho : G_{\mathbf{Q}} \to GL_2(\mathbf{F}_p)$, $p \ge 3$, satisfy the assumptions of Section 
2.1. Then there exists a $\sigma \in G_{\mathbf{Q}}$ such that if $\alpha$ and $\beta$ are the two eigenvalues 
of $\bar\rho(\sigma)$, $\{1, \alpha, \beta, \alpha\beta = \omega(\sigma)\}$ are all distinct. 

Proof: If $p = 3$, a case-by-case analysis reveals that $\mathrm{Im}(\bar\rho)$ contains a 
matrix of determinant $-1$ and trace $\pm 1$ satisfying the requirements of the 
lemma. 

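If $p > 3$ let $R \in \mathbf{F}_p^\times$ be a primitive root mod p and choose $\sigma$ with 
$\omega(\sigma) = R$. If $\alpha, \beta$ are the two eigenvalues of $\bar\rho(\sigma)$ and they are not in 
$\mathbf{F}_p$, then we are done, because they must be distinct as conjugates of each 
other. If they are in $\mathbf{F}_p$, again they must be distinct because R is not a 
square. Suppose $\{\alpha, \beta\} = \{1, R\}$. Then in a suitable basis $\bar\rho(\sigma) = \begin{pmatrix} 1 & 0 \\ 0 & R \end{pmatrix}$. 
Let $L = \mathbf{Q}(\zeta_p)$ and pick $\bar\rho(\tau) = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \bar\rho(G_L)$ with $bc \neq 0$. If this is 
not possible, $\bar\rho(G_L)$ is contained in a Borel subgroup, so either $\bar\rho(G_{\mathbf{Q}})$ is 
reducible, or it is (generalized) dihedral, and both cases are ruled out by 
our assumptions. Conjugating $\bar\rho(\tau)$ by a power of $\bar\rho(\sigma)$ we get in $\mathrm{Im}(\bar\rho)$ 
every matrix of the form $\begin{pmatrix} a & xb \\ x^{-1}c & d \end{pmatrix}$, $x \in \mathbf{F}_p^\times$ (since R was a primitive 
root modulo p), hence $\mathrm{Im}(\bar\rho)$ contains 

$$u = \begin{pmatrix} 1 & 0 \\ 0 & R \end{pmatrix} \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} a & xb \\ x^{-1}c & d \end{pmatrix} = \begin{pmatrix} a^2 + bx^{-1}c & * \\ * & R(d^2 + cxb) \end{pmatrix}$$

and $\mathrm{tr}(u) = a^2 + bcx^{-1} + R(d^2 + bcx)$. Since $p > 3$ we can find an x such 
that $\mathrm{tr}(u) \neq 1 + R$ and this finishes the proof of the lemma. □ 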
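
The last step of the lemma can be tested exhaustively for small primes. The following is a minimal sketch with hypothetical helper names: it runs over all matrices (a b; c d) of determinant 1 over F_p with bc ≠ 0 and confirms that some x in F_p^× gives a trace a² + bc·x⁻¹ + R(d² + bc·x) different from 1 + R. It is a sanity check under the stated assumptions, not a replacement for the argument.

```python
from itertools import product


def primitive_root(p):
    """Smallest generator of the cyclic group F_p^* (p an odd prime)."""
    for g in range(2, p):
        if len({pow(g, k, p) for k in range(1, p)}) == p - 1:
            return g
    raise ValueError("no primitive root found; is p an odd prime?")


def trace_u(a, b, c, d, x, R, p):
    """tr(u) = a^2 + b*c*x^{-1} + R*(d^2 + b*c*x) computed in F_p."""
    x_inv = pow(x, -1, p)
    return (a * a + b * c * x_inv + R * (d * d + b * c * x)) % p


def lemma_trace_step_holds(p):
    """True if every admissible (a, b, c, d) admits x with tr(u) != 1 + R."""
    R = primitive_root(p)
    for a, b, c, d in product(range(p), repeat=4):
        if (a * d - b * c) % p != 1 or (b * c) % p == 0:
            continue  # keep only determinant 1 with b*c != 0
        if all(trace_u(a, b, c, d, x, R, p) == (1 + R) % p
               for x in range(1, p)):
            return False
    return True
```

The reason the check succeeds for p > 3 is that x⁻¹ + Rx takes at least (p − 1)/2 ≥ 2 distinct values as x varies, so the trace cannot be constant when bc ≠ 0.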


4. PROOF OF PROPOSITION 11 — ON THE STRUCTURE OF THE 
UNIVERSAL DEFORMATION RING 


4.1. The Selmer group. For the concepts defined in this section, see also 
Washington’s paper in this volume [Wa]. Let $\bar\rho$ be as in 2.1, and define 

(4.1) $W = \mathrm{ad}^0(\bar\rho)$ (trace-0 matrices in the adjoint representation of $\bar\rho$) 
(4.2) $W^* = \mathrm{Hom}(W, \mu_p) \cong W(1) \cong \mathrm{Symm}^2(\bar\rho)$ 

(the last isomorphism stems from 

$$\mathrm{ad}(\bar\rho) = \bar\rho \otimes \bar\rho^\vee = \bar\rho \otimes \bar\rho(-1) = \mathrm{Symm}^2(\bar\rho)(-1) \oplus k,$$

hence $W = \mathrm{Symm}^2(\bar\rho)(-1)$). If $\bar\rho$ is “Selmer,” U is the space of $\bar\rho$, and 
$0 \to U^0 \to U \to U^1 \to 0$ is the filtration defining the Selmer condition at p 
(so that $U^i$ are modules for $\bar\rho|G_p$ and $U^0$ is the $\omega$-eigenspace for $I_p$), put 

(4.3) $W^0 = \mathrm{Hom}(U/U^0, U^0) \subset W^1 = \mathrm{Hom}^0((U, U^0), (U, U^0)) \subset W$. 

Thus $W^1$ is the submodule of upper-triangular, trace-0 matrices, and $W^0$ 
consists of those whose diagonal vanishes. Note that $G_p$ acts on $W^0$ through 
$\omega\psi^{-2} = \omega$ (see remark 2.1(ii)), trivially on $W^1/W^0$, and through $\omega^{-1}\psi^{2} = 
\omega^{-1}$ on $W/W^1$. 

We shall next define the “local conditions,” which are subgroups $L_v$ 
of $H^1(G_v, W)$ for the various decomposition groups $G_v$. We shall later 
consider global cohomology classes whose restriction to every $G_v$ falls into 
$L_v$. Since $p \ge 3$, $H^1(G_\infty, W) = 0$, and we only have to define these local 
conditions at the finite places. Let 

• $L_l = H^1(G_l/I_l, W^{I_l})$ if $l | N(\bar\rho)$ 

• $L_p = H^1_{Se}(G_p, W) := \mathrm{Ker}(H^1(G_p, W) \to H^1(I_p, W/W^0))$ if $\bar\rho$ is 
Selmer but not flat; and $L_p = H^1_f(G_p, W)$ if $\bar\rho$ is flat. 

• $L_q = H^1(G_q, W)$ if $q | Q$. 

Some explanations are in order. First, we did not define $L_l$ for primes 
$l \notin \Sigma \cup Q$, but they could be defined there too as $H^1(G_l/I_l, W^{I_l}) = 
H^1(G_l/I_l, W)$. Instead, we shall assume in (4.5) below that our 
cohomology classes are unramified outside $\Sigma \cup Q$, which, of course, is an equivalent 
condition. 

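At $l | N(\bar\rho)$ our local condition conforms with [W], p.461, because 

$$H^1(G_l/I_l, W^{I_l}) = \mathrm{Ker}(H^1(G_l, W) \to H^1(G_l, W/W^{I_l})).$$

To check this use the exact diagram 

$$\begin{array}{ccccc}
H^1(G_l/I_l, W^{I_l}) & \hookrightarrow & H^1(G_l, W^{I_l}) & \to & H^1(I_l, W^{I_l}) \\
\| & & \downarrow & & \downarrow 0 \\
H^1(G_l/I_l, W^{I_l}) & \hookrightarrow & H^1(G_l, W) & \to & H^1(I_l, W) \\
 & & \downarrow & & \downarrow \\
 & & H^1(G_l, W/W^{I_l}) & \to & H^1(I_l, W/W^{I_l})
\end{array} \tag{4.4}$$

To justify the arrow labeled with a 0 note that it is part of the long exact 
sequence 

$$0 \to W^{I_l} \to W^{I_l} \to (W/W^{I_l})^{I_l} \to H^1(I_l, W^{I_l}) \xrightarrow{\;0\;} H^1(I_l, W)$$

and both $(W/W^{I_l})^{I_l}$ and $H^1(I_l, W^{I_l}) = \mathrm{Hom}(I_l, k)$ are one-dimensional 
over k. 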

At the auxiliary q's we simply took L, to be the whole H', which means 
that there is no local condition imposed there. 

The most subtle condition is the one imposed at p. If $\bar\rho$ is Selmer 
non-flat the definition is analogous to that of $L_l$ for $l | N(\bar\rho)$. If $\bar\rho$ is flat, then 
$H^1_f(G_p, W)$ has the same meaning as in [Co]. Let us recall that there is a 
canonical isomorphism between $H^1(G_p, \mathrm{ad}(\bar\rho))$ and $\mathrm{Ext}^1_{k[G_p]}(\bar\rho, \bar\rho)$, the Ext 
group being computed in the category of $k[G_p]$-modules. Since $\bar\rho|G_p$ is 
assumed to be the representation attached to a finite flat group scheme $\mathcal{G}$ 
(say), one may ask whether a specific Galois module, which is an extension 
of $\bar\rho$ by $\bar\rho$, in fact arises from a finite flat group scheme over $\mathbf{Z}_p$ which is 
an extension of $\mathcal{G}$ by $\mathcal{G}$ in the category of group schemes over $\mathbf{Z}_p$. This 
need not be the case in general, and one denotes by $H^1_f(G_p, \mathrm{ad}(\bar\rho))$ those 
cohomology classes whose corresponding extension arises in such a manner. 
Then $H^1_f(G_p, W) = H^1_f(G_p, \mathrm{ad}(\bar\rho)) \cap H^1(G_p, W)$. While this definition of 
the local condition at p is appealing intuitively, it is not directly amenable 
to computations in Galois cohomology. For computational purposes it is 
important to have a “linear-algebra” criterion for an extension class to be 
“flat,” and this is supplied by Fontaine’s work on Honda systems (as 
explained in [Co]), or by later developments due to Fontaine and Laffaille (as 
used in [W]). 


Definition 5. The “Selmer group” is 

$$H^1_{D_Q}(\mathbf{Q}, W) = \mathrm{Ker}\Big( H^1(G_{\Sigma \cup Q}, W) \to \prod_{v \in \Sigma \cup Q} H^1(G_v, W)/L_v \Big). \tag{4.5}$$

In other words, it is the group of cohomology classes which at every place 
v satisfy the “local condition” $L_v$. 


The importance of the Selmer group stems from the fact that it is 
canonically isomorphic to the reduced tangent space of the deformation problem. 
Recall that we denoted by $R_Q$ the universal deformation ring for 
deformations of $\bar\rho$ of type $D_Q$, and that, for notational convenience, we drop the 
subscript Q if Q is empty. 


Theorem 15. ([M2], [M3]) There is a canonical isomorphism 

$$H^1_D(\mathbf{Q}, W) = \mathrm{Hom}(\mathfrak{m}_R/(\lambda, \mathfrak{m}_R^2), k). \tag{4.6}$$

A similar identity holds with $R_Q$ and $D_Q$. In particular, $R_Q$ can be 
topologically generated as an $\mathcal{O}$-algebra by $r(Q) = \dim_k H^1_{D_Q}(\mathbf{Q}, W)$ elements, 
and this is the minimal number of generators. 


Proof: The last assertion, about the number of generators needed to 
generate $R_Q$ as a topological $\mathcal{O}$-algebra, is a consequence of (4.6) and 
Nakayama’s lemma. Let us prove (4.6). A k-linear homomorphism from 
$\mathfrak{m}_R/(\lambda, \mathfrak{m}_R^2)$ to k defines a local $\mathcal{O}$-algebra homomorphism from R to the 
ring of dual numbers $k[\epsilon]$ ($\epsilon^2 = 0$), and vice versa. It therefore corresponds 
to $\rho$, an infinitesimal deformation of $\bar\rho$ (i.e., a deformation with values in 
$GL_2(k[\epsilon])$), which is unique up to strict equivalence. Writing 

$$\rho(g)\bar\rho(g)^{-1} = 1 + \epsilon c_\rho(g), \tag{4.7}$$

$c_\rho(g)$ becomes a 1-cocycle in W for the adjoint action, namely 

$$c_\rho(gh) = c_\rho(g) + \mathrm{ad}(\bar\rho)(g)\, c_\rho(h) \tag{4.8}$$

($c_\rho(g)$ has trace zero because $\det(\rho) = \det(\bar\rho) = \omega$). Replacing $\rho$ by a 
strictly equivalent deformation has the effect of changing $c_\rho$ by a 
coboundary, and every cohomology class is obtained in such a way from an 
infinitesimal deformation of $\bar\rho$. This is the construction which associates to an 
infinitesimal deformation of $\bar\rho$ a class $[c_\rho]$ in $H^1(\mathbf{Q}, W)$. It remains to check 
that $\rho$ is of type D if and only if $[c_\rho] \in H^1_D(\mathbf{Q}, W)$. 

Assume first that $l | N(\bar\rho)$, and $\rho$ is of type D. Then for $g \in I_l$ we have 
$c_\rho(g) \in \left\{ \begin{pmatrix} 0 & x \\ 0 & 0 \end{pmatrix} : x \in k \right\} = W^{I_l}$, so the image of $[c_\rho]$ in $H^1(I_l, W/W^{I_l})$ 
is 0. In view of (4.4) (the injectivity of $H^1(I_l, W) \to H^1(I_l, W/W^{I_l})$) $[c_\rho]$ 
satisfies the local condition at l. Conversely, if this condition is satisfied 
$\rho|I_l$ is upper triangular, hence so is $\rho|G_l$ ($I_l$ being normal in $G_l$), and it 
follows that it has the shape of definition 2.1. 

The argument at p, in the case that $\bar\rho$ is Selmer non-flat, is entirely 
analogous, $W^0$ replacing $W^{I_l}$. In the flat case, the definition of $H^1_f(G_p, W)$ 
makes the desired result a tautology. Finally at $l \notin \Sigma$, $\rho$ is unramified if 
and only if $[c_\rho]$ is unramified. 

A similar proof works when Q is not necessarily empty. □ 


4.2. The dual Selmer group. By local Tate duality, at each place v, 
$H^1(G_v, W)$ and $H^1(G_v, W^*)$ are dual abelian groups. We let $L_v^*$ be the 
exact annihilator of $L_v$. At $l | N(\bar\rho)$ we have $L_l^* = H^1(G_l/I_l, W^{*I_l})$, and at 
$q | Q$ we simply have $L_q^* = 0$. The dual Selmer group is defined by 

$$H^1_{D_Q^*}(\mathbf{Q}, W^*) = \mathrm{Ker}\Big( H^1(G_{\Sigma \cup Q}, W^*) \to \prod_{v \in \Sigma \cup Q} H^1(G_v, W^*)/L_v^* \Big). \tag{4.9}$$

The central observation concerning the Selmer group and the dual Selmer 
group is that while their orders can be pretty hard to compute, their ratio 
is expressible as a product of local terms. The next theorem is a corollary 
of Tate’s global duality (the Tate-Poitou exact sequence). See [Gr] and 
[Wa] in this volume for its derivation. 


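Theorem 16. Notation as above, 

$$\frac{\#H^1_D(\mathbf{Q}, W)}{\#H^1_{D^*}(\mathbf{Q}, W^*)} = \frac{\#H^0(\mathbf{Q}, W)}{\#H^0(\mathbf{Q}, W^*)} \prod_{v \in \Sigma \cup \{\infty\}} \frac{\#L_v}{\#H^0(G_v, W)}. \tag{4.10}$$

A similar result holds with $D_Q$ replacing D, and $\Sigma \cup Q$ replacing $\Sigma$. Wiles’ 
idea was that with a cleverly chosen set Q one can make $H^1_{D_Q^*}(\mathbf{Q}, W^*)$ 
vanish, and so control $H^1_{D_Q}(\mathbf{Q}, W)$, hence $r(Q)$. To this end we have to 
compute the local terms that intervene in the right hand side of the formula. 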

4.3. Computation of the local terms. 

• $\#L_p = \#H^0(G_p, W) \cdot \#k$ 

In the flat case this computation is quite delicate. The main point is 
that if $\bar\rho|G_p$ is absolutely irreducible, $\dim H^1_f(G_p, \mathrm{ad}(\bar\rho)) = 2$. This was 
proved by Ramakrishna in his thesis [Ra] using the theory of 
Fontaine-Laffaille (alternatively, one can use Fontaine’s theory of “Honda systems,” 
see [Co]). It follows that $L_p = H^1_f(G_p, \mathrm{ad}^0(\bar\rho))$ is 1-dimensional, which is 
the desired result, since if $\bar\rho|G_p$ is irreducible, $H^0(G_p, W)$ vanishes. See 
[Ra], [Co] or [D-D-T] (section 2.5) for full details, and a treatment of the 
case where $\bar\rho|G_p$ is reducible as well. 

In the Selmer non-flat case we have the filtration $0 \subset W^0 \subset W^1 \subset 
W$ as explained in 4.1 above, so $H^0(G_p, W) = 0$, and we have to show 
that $L_p = \mathrm{Ker}(H^1(G_p, W) \to H^1(I_p, W/W^0))$ is one-dimensional. The 
key point is that $L_p = \mathrm{Ker}(H^1(G_p, W) \to H^1(G_p, W/W^0))$ (compare the 
situation at $l | N(\bar\rho)$, diagram (4.4)). Denote for the moment the latter group 
by $L'_p \subset L_p$. $L_p$ is the tangent space to the (local) deformation problem 
classifying deformations of $\bar\rho|G_p$ of the form $\begin{pmatrix} \omega\tilde\psi^{-1} & * \\ 0 & \tilde\psi \end{pmatrix}$ where $\tilde\psi$ is an 
unramified character of $G_p$. On the other hand $L'_p$ is the tangent space 
to the (local) deformation problem classifying deformations of $\bar\rho|G_p$ of the 
form $\begin{pmatrix} \omega\psi^{-1} & * \\ 0 & \psi \end{pmatrix}$ where $\psi$ is the character appearing in $\bar\rho$. As explained 
in remark 2.2(i), the assumption that $\bar\rho$ is non-flat implies that in any 
deformation $\tilde\psi = \psi$. (In the language of [W], any Selmer deformation 
is strict, see [W], proposition 1.1, p.459, and [Dil], 6.1.) Hence the two 
tangent spaces are the same, and $L_p = L'_p$. Alternatively, one can use a 
diagram similar to (4.4) 

$$\begin{array}{ccc}
H^1(G_p, W^0) & \cong & H^1(I_p, W^0)^{G_p/I_p} \\
\downarrow & & \downarrow \\
H^1(G_p, W) & \cong & H^1(I_p, W)^{G_p/I_p} \\
\downarrow & & \downarrow \\
H^1(G_p, W/W^0) & \to & H^1(I_p, W/W^0).
\end{array} \tag{4.11}$$

Since $W^{I_p} = 0$, the horizontal maps on the top and in the middle are 
isomorphisms. This immediately implies $L_p = L'_p$. Now compute 

$$H^0(G_p, W/W^0) = W^1/W^0 = k,$$
$$H^1(G_p, W^0) = H^1(G_p, \mu_p \otimes k) = \mathbf{Q}_p^\times/\mathbf{Q}_p^{\times p} \otimes k = k^2,$$

and the desired result comes out from the long exact sequence in 
cohomology associated to $0 \to W^0 \to W \to W/W^0 \to 0$. 

• $\#L_\infty = \#H^0(G_\infty, W)/\#k = 1$ 

Here $L_\infty = H^1(G_\infty, W) = (0)$ by definition. Since the eigenvalues of 
complex conjugation on $\bar\rho$ are $\pm 1$, the eigenvalues on W are $\{-1, -1, +1\}$, 
and the second equality follows. 
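
The eigenvalue count at infinity can be made concrete. The following is a minimal sketch, assuming a basis in which complex conjugation acts on the representation space as diag(1, −1); the names `E`, `H`, `F` for the standard basis of trace-zero matrices are illustrative labels, not notation from the text.

```python
def matmul(A, B):
    """Product of two 2x2 integer matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


# Complex conjugation in a basis of eigenvectors; it is its own inverse.
c = [[1, 0], [0, -1]]

# Standard basis of the trace-zero 2x2 matrices.
basis = {
    "E": [[0, 1], [0, 0]],   # strictly upper triangular
    "H": [[1, 0], [0, -1]],  # diagonal
    "F": [[0, 0], [1, 0]],   # strictly lower triangular
}


def adjoint_sign(X):
    """Return the eigenvalue (+1 or -1) of X under X -> c X c^{-1}."""
    Y = matmul(matmul(c, X), c)
    if Y == X:
        return 1
    if Y == [[-x for x in row] for row in X]:
        return -1
    raise ValueError("X is not an eigenvector of the adjoint action")


signs = {name: adjoint_sign(X) for name, X in basis.items()}
```

The single +1 eigenvector spans the invariants of complex conjugation on W, which is why that space of invariants has exactly #k elements.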


• $\#L_l = \#H^0(G_l, W)$ for $l | N(\bar\rho)$ 

This follows from the exact sequence 

$$0 \to H^0(G_l, W) \to H^0(I_l, W) \to H^0(I_l, W) \to H^1(G_l/I_l, W^{I_l}) \to 0 \tag{4.12}$$

where the arrow in the middle is $\mathrm{Frob}_l - 1 : W^{I_l} \to W^{I_l}$. 

• $H^0(\mathbf{Q}, W) = H^0(\mathbf{Q}, W^*) = 0$ 

For W this follows from the absolute irreducibility of $\bar\rho$, since by Schur’s 
lemma the only endomorphisms commuting with the Galois action are the 
scalars, which are missing from W. 

For $W^*$ the argument is a little different. A Galois-invariant vector in 
$W^*$ is, by (4.2), an invariant symmetric bilinear form. If this bilinear form 
is degenerate, then its kernel is invariant, contradicting the irreducibility 
of $\bar\rho$. If it is non-degenerate, this means that the image of $\bar\rho$ is contained in 
some orthogonal group. This already contradicts the fact that $\det(\bar\rho) = \omega$, 
unless $p = 3$ (when $\omega^2 = 1$), but in this case it contradicts the absolute 
irreducibility of $\bar\rho|G_L$, L as in 2.1. Indeed, over $\bar k$ (which is of odd 
characteristic) the invariant quadratic form could be taken to be the standard 
one $x^2 + y^2$, and its orthogonal group the group 

$$O(2, \bar k) = \left\{ \begin{pmatrix} a & b \\ -b & a \end{pmatrix} : a^2 + b^2 = 1 \right\} \cup \left\{ \begin{pmatrix} a & b \\ b & -a \end{pmatrix} : a^2 + b^2 = 1 \right\}.$$

Then $\bar\rho(G_L) \subset SO(2, \bar k) = \left\{ \begin{pmatrix} a & b \\ -b & a \end{pmatrix} : a^2 + b^2 = 1 \right\}$, which is 
diagonalizable. 

• $\#H^0(G_q, W) = \#k$ for $q \in Q$ 

The eigenvalues of $\mathrm{Frob}_q$ on W are $\alpha_q\beta_q = q = 1$ (since $q \equiv 1 \bmod p$), $\alpha_q^2$ 
and $\beta_q^2$. The latter two are not equal to 1 by our assumption that $\alpha_q \neq \beta_q$. 

• $\#H^1(G_q, W) = \#k^2$ 

Here, at $q \in Q$, $H^1(G_q/I_q, W) = W/(\mathrm{Frob}_q - 1)W$ is one-dimensional 
(by the assumptions on $\alpha_q$ and $\beta_q$), 

$$H^1(I_q, W)^{G_q/I_q} = \mathrm{Hom}(\mathbf{Z}_p(1), W)^{\mathrm{Frob}_q} = W[\mathrm{Frob}_q - q] = W[\mathrm{Frob}_q - 1] = W^{\mathrm{Frob}_q}$$

is again one-dimensional, and $H^2(G_q/I_q, W) = 0$ since $G_q/I_q \cong \hat{\mathbf{Z}}$. The 
desired result follows from the inflation-restriction exact sequence. 
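
The Frobenius eigenvalues on W used in the two computations at q can be tabulated directly. The following is a minimal sketch, assuming for illustration that Frobenius acts as diag(α, β) with α, β in the prime field F_p and αβ = 1; the function names are hypothetical.

```python
def adjoint_eigenvalues(alpha, beta, p):
    """Eigenvalues of conjugation by diag(alpha, beta) on trace-zero
    2x2 matrices over F_p, on the basis E (upper), H (diagonal), F (lower)."""
    inv_alpha, inv_beta = pow(alpha, -1, p), pow(beta, -1, p)
    return {
        "E": alpha * inv_beta % p,  # alpha/beta
        "H": 1,                     # the diagonal direction is fixed
        "F": beta * inv_alpha % p,  # beta/alpha
    }


def fixed_dimension(alpha, beta, p):
    """Dimension of the Frobenius-fixed subspace of W (Frobenius is
    diagonalizable here, so this counts eigenvalues equal to 1)."""
    return sum(1 for ev in adjoint_eigenvalues(alpha, beta, p).values()
               if ev == 1)


def check_prime(p):
    """For every alpha != beta with alpha*beta = 1, the fixed space is a line
    and the nontrivial eigenvalues are alpha^2 and beta^2."""
    for alpha in range(1, p):
        beta = pow(alpha, -1, p)
        if alpha == beta:
            continue
        evs = adjoint_eigenvalues(alpha, beta, p)
        if fixed_dimension(alpha, beta, p) != 1:
            return False
        if evs["E"] != alpha * alpha % p or evs["F"] != beta * beta % p:
            return False
    return True
```

Since the cokernel of Frobenius minus 1 has the same dimension as its kernel, the same count gives the one-dimensionality of the unramified cohomology at q.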


These computations allow us to deduce the following corollary from the- 
orem 16. 


Corollary 17. (i) $r(Q) = \dim_k H^1_{D_Q}(\mathbf{Q}, W) = \dim_k H^1_{D_Q^*}(\mathbf{Q}, W^*) + \#(Q)$. 
(ii) When we add a new prime q as above to a given set of primes Q, either 
$\dim_k H^1_{D_Q^*}(\mathbf{Q}, W^*)$ drops by 1, or $\dim_k H^1_{D_Q}(\mathbf{Q}, W)$ grows by 1. □ 

To prove proposition 11 it is therefore enough to prove the following: 

Proposition 18. Let $r = \dim_k H^1_{D^*}(\mathbf{Q}, W^*)$. Then for every n it is 
possible to choose a set Q of r primes, disjoint from $\Sigma$, satisfying 

• For every $q \in Q$, $q \equiv 1 \bmod p^n$. 

• For every $q \in Q$, $\bar\rho(\mathrm{Frob}_q)$ has distinct eigenvalues in k. 

• $H^1_{D_Q^*}(\mathbf{Q}, W^*) = 0$. 

It will then follow from the corollary that $r = r(Q)$, and from theorem 
15 (section 4.1) that $R_Q$ can be topologically generated as an $\mathcal{O}$-algebra by r elements. 
Discussion. The “miracle” occurring in the minimal case is that all the 
local terms in the product expressing the ratio $\#H^1_D(\mathbf{Q}, W)/\#H^1_{D^*}(\mathbf{Q}, W^*)$ 
are 1, hence we can achieve an equality $r = r(Q)$. In the numerical criterion 
of lemma 9 it is crucial to know that r, the number of variables $S_i$ (which 
turns out to be the number of primes in Q, see the discussion preceding 
proposition 10), is actually equal to $r(Q)$, the minimal number of generators 
of $R_Q$, and is not smaller. In the non-minimal case some of the local terms 
in (4.10) would be bigger than 1, and corollary 17(i) would look like 

$$r(Q) = \dim_k H^1_{D_Q}(\mathbf{Q}, W) = \dim_k H^1_{D_Q^*}(\mathbf{Q}, W^*) + t + \#(Q)$$

for some $t > 0$, so we would never be able to achieve $r(Q) = r$. 
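
The bookkeeping behind corollary 17(i) and the formula with the extra term t amounts to taking logarithms to base #k in (4.10) and adding the local contributions computed in section 4.3. The following is a minimal sketch of that arithmetic; the function and parameter names are hypothetical, and the per-place values are those read off from the local computations (in particular, the two one-dimensional pieces of the inflation-restriction sequence at each auxiliary prime).

```python
def selmer_dimension_gap(num_aux_primes, extra_local_dims=0):
    """dim H^1 of the Selmer group minus dim H^1 of the dual Selmer group,
    obtained as the base-#k logarithm of the right hand side of (4.10).

    extra_local_dims models the term t > 0 of the non-minimal case.
    """
    global_term = 0 - 0        # dim H^0(Q, W) - dim H^0(Q, W*)
    at_p = 1                   # dim L_p - dim H^0(G_p, W)
    at_infinity = -1           # dim L_inf - dim H^0(G_inf, W) = 0 - 1
    at_level_primes = 0        # dim L_l = dim H^0(G_l, W) for l | N
    at_aux_primes = num_aux_primes * (2 - 1)  # dim H^1(G_q, W) - dim H^0(G_q, W)
    return (global_term + at_p + at_infinity + at_level_primes
            + at_aux_primes + extra_local_dims)
```

In the minimal case the gap equals the number of auxiliary primes, so killing the dual Selmer group with r primes forces r(Q) = r; any positive `extra_local_dims` makes r(Q) strictly larger than the number of primes.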

What remains to be checked is that we can “kill” the dual Selmer group 
by a clever choice of the set Q, while maintaining the other restrictions 
imposed on the q’s in Q. The proof of this fact, given in the next section, 
is group-theoretical in nature, based on the Cebotarev density theorem. 


5. CONCLUSION OF THE PROOF : SOME GROUP THEORY 


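5.1. A reduction step. Since 

$$H^1_{D_Q^*}(\mathbf{Q}, W^*) = \mathrm{Ker}\Big( H^1_{D^*}(\mathbf{Q}, W^*) \to \prod_{q \in Q} H^1(G_q/I_q, W^*) \Big),$$

it is enough to prove that for every cohomology class $0 \neq [\psi] \in H^1_{D^*}(\mathbf{Q}, W^*)$ 
($\psi$ is a cocycle representing the cohomology class) there are infinitely many 
primes q not in $\Sigma$ such that 

• $q \equiv 1 \bmod p^n$ 

• the two eigenvalues of $\bar\rho(\mathrm{Frob}_q)$ are distinct 

• $\mathrm{res}_q([\psi]) \neq 0$. 

We can then choose the q’s successively, each time decreasing the 
dimension of $H^1_{D_Q^*}(\mathbf{Q}, W^*)$ by 1, until we annihilate it completely after r 
steps. 

By Cebotarev’s density theorem it is enough to find a $\sigma \in G_{\mathbf{Q}}$ such that 

• $\sigma | \mathbf{Q}(\zeta_{p^n}) = 1$, 

• the eigenvalues of $\bar\rho(\sigma)$ are distinct, 

• $\psi(\sigma) \notin (\sigma - 1)W^*$. 

If we find such a $\sigma$, let q be a prime with $\sigma|L = \left(\frac{L/\mathbf{Q}}{\mathfrak{Q}}\right)$, where L 
is a Galois extension of $\mathbf{Q}$ containing $\mathbf{Q}(\zeta_{p^n})$ and the splitting fields of 
$\bar\rho$ and $\psi$, and $\mathfrak{Q}$ is a suitable prime above q in L. Then $\sigma$ belongs to 
the decomposition group of q in L, so the last condition clearly implies 
$\mathrm{res}_q([\psi]) \neq 0$, and the first two obviously imply the other two restrictions 
imposed on q. 

Consider the following tower of fields: 

F = extension of $\mathbf{Q}(\zeta_{p^n})$ cut by $\mathrm{Ker}(\bar\rho)$ 
K = extension of $\mathbf{Q}(\zeta_{p^n})$ cut by $\mathrm{Ker}(\mathrm{ad}^0(\bar\rho))$ 
$\mathbf{Q}(\zeta_{p^n})$ 

with $H = \mathrm{Gal}(F/\mathbf{Q}(\zeta_{p^n})) \subset G = \mathrm{Im}(\bar\rho)$ and $\bar H = \mathrm{Gal}(K/\mathbf{Q}(\zeta_{p^n}))$. 

Since $\mathrm{Ker}(\mathrm{Im}(\bar\rho) \to \mathrm{Im}(\mathrm{ad}^0(\bar\rho)))$ are the scalars in $\mathrm{Im}(\bar\rho)$, we have 

$$\bar H = Hk^\times/k^\times \subset \bar G = \mathrm{Im}(\mathrm{ad}^0(\bar\rho)) \subset PGL_2(k).$$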
Lemma 19. $H^1(K/\mathbf{Q}, W^*) = 0$. 


Let us postpone the proof of the lemma for the moment, and see how to 
use it to find a $\sigma$ as above. From the lemma, 

$$0 \neq \psi|G_K \in \mathrm{Hom}(G_K, W^*)^{\mathrm{Gal}(K/\mathbf{Q})}.$$

But $\psi(G_K)$ is a $G_{\mathbf{Q}}$-submodule of $W^*$, hence by the irreducibility of $\bar\rho|G_L$, 
which implies the irreducibility of $W^*$ (see the argument in section 4.3), it 
is all of $W^*$. 

Next we claim that it is possible to find $\sigma_0 \in G_{\mathbf{Q}(\zeta_{p^n})}$ such that $\bar\rho(\sigma_0)$ has 
distinct eigenvalues $\{\alpha, \beta\}$. If this is not the case, $\bar\rho(G_{\mathbf{Q}(\zeta_{p^n})})$ is contained 
in the upper-triangular matrices, by an elementary computation with 2×2 
matrices. But then, either $\bar\rho(G_{\mathbf{Q}(\zeta_{p^n})})$ is contained in a torus, in which case $\bar\rho$ 
is dihedral, and therefore $\bar\rho|G_L$ is not irreducible, or $\bar\rho|G_{\mathbf{Q}(\zeta_{p^n})}$ has a unique 
invariant line, in which case it must be invariant under $\bar\rho$, contradicting its 
irreducibility. 

The eigenvalues of $\sigma_0$ on W are $\alpha/\beta$, 1, and $\beta/\alpha$. These are also its 
eigenvalues on $W^*$ because $\sigma_0$ fixes $\zeta_p$. Thus $(\sigma_0 - 1)W^* \neq W^*$, and 
$\psi(G_K) \not\subset (\sigma_0 - 1)W^*$. 

Now look for $\sigma = \tau\sigma_0$ with $\tau \in G_K$. Note that $\tau$ acts trivially on W and 
$W^*$. Since $\bar\rho(\tau)$ is a scalar matrix, $\bar\rho(\sigma)$ will still have distinct eigenvalues, 
and clearly $\sigma$ leaves the $p^n$ roots of unity invariant. All that remains to 
check is that 

$$\psi(\sigma) = \tau\psi(\sigma_0) + \psi(\tau) = \psi(\sigma_0) + \psi(\tau) \notin (\sigma - 1)W^* = (\sigma_0 - 1)W^*. \tag{5.1}$$

If $\sigma = \sigma_0$ is not good, then by the above we can find a $\tau \in G_K$ such that 
adding $\psi(\tau)$ will move $\psi(\sigma_0)$ out of $(\sigma_0 - 1)W^*$. □ 
Proof of the lemma: We need the following classification theorem, due 
to Dickson ([Hu], II.8.27). 


Theorem 20. Any finite subgroup of $PGL_2(k)$ is one of the following: 
(1) a subgroup of a Borel group, 
(2) conjugate to $PGL_2(k')$ or $PSL_2(k')$ for a finite field $k'$, 
(3) isomorphic to the dihedral group $D_{2n}$ for $(n, p) = 1$, 
(4) isomorphic to $A_4$, $S_4$, or $A_5$. □ 

Let $G = \mathrm{Im}(\bar\rho) = \mathrm{Gal}(M/\mathbf{Q})$, where M is the splitting field of $\bar\rho$. Let 
$\bar G = \mathrm{Im}(\mathrm{ad}^0(\bar\rho)) = \mathrm{Gal}(N/\mathbf{Q})$, where N is the splitting field of $\mathrm{ad}^0(\bar\rho)$. Let 
$Z = \mathrm{Ker}(G \to \bar G) = \mathrm{Gal}(M/N)$, the scalar matrices in $\mathrm{Im}(\bar\rho)$. 


Case 1. $Z \not\subset \{\pm 1\}$. Then $\det(Z) \neq 1$, so $W^Z = W$ implies that $W^{*Z} = 0$. 
Now Z is cyclic of order prime to p, and F/M is a p-extension since 
$\mathbf{Q}(\zeta_p) \subset M \subset F = M(\zeta_{p^n})$, so Z lifts to a subgroup (still denoted Z) 
of $\mathrm{Gal}(F/\mathbf{Q})$. From the inflation-restriction exact sequence we get in this 
case $H^1(F/\mathbf{Q}, W^*) = H^1(F^Z/\mathbf{Q}, W^{*Z}) = 0$. A fortiori $H^1(K/\mathbf{Q}, W^*) = 0$. 

Case 2. $Z \subset \{\pm 1\}$, $p > 3$. Then Z fixes $\mathbf{Q}(\zeta_p)$ and therefore $\Delta = 
\mathrm{Gal}(\mathbf{Q}(\zeta_p)/\mathbf{Q})$ is a quotient of $\bar G \subset PGL_2(k)$. Using the classification 
theorem we see that $\bar G$ is contained in a Borel, or has order prime to p (the other 
groups don’t have a cyclic quotient of order $p - 1$). The first option 
contradicts the irreducibility of $\bar\rho$. The second implies that $\bar H = \mathrm{Gal}(K/\mathbf{Q}(\zeta_{p^n}))$ 
also has order prime to p, so $H^1(K/\mathbf{Q}, W^*) = H^1(\mathbf{Q}(\zeta_{p^n})/\mathbf{Q}, W^{*\bar H})$. But 
$W^{*\bar H} = 0$, or else $W^*$ is reducible, contradicting the absolute irreducibility 
of $\bar\rho|G_L$. 

Case 3. $Z \subset \{\pm 1\}$, $p = 3$. We may assume that $3 | \#\bar G$, and $\bar G$ is not 
contained in a Borel, otherwise we finish the proof as in case 2. Again, 
$\Delta = \mathrm{Gal}(\mathbf{Q}(\zeta_3)/\mathbf{Q})$ is a quotient of $\bar G$, so this rules out $\bar G = A_5$, and we are 
left with the possibility $\bar G = PGL_2(k')$ ($PSL_2(k')$ is simple if $k' \neq \mathbf{F}_3$, and 
for $\mathbf{F}_3$ it is $A_4$, which does not have a normal subgroup of index 2. Also 
$S_4 = PGL_2(\mathbf{F}_3)$). 

If $k' = \mathbf{F}_3$ then $S_4 = PGL_2(\mathbf{F}_3) = \mathrm{Gal}(N/\mathbf{Q})$ has a normal subgroup 
$V_4 \subset \mathrm{Gal}(N/L)$ (its unique 2-Sylow subgroup). Since $\mathrm{Gal}(K/N)$ is a 
3-group, this $V_4$ lifts to a normal subgroup of $\mathrm{Gal}(K/L)$. Since $\bar\rho|G_L$ is 
absolutely irreducible, $W^{*V_4} = 0$, and inflation-restriction finishes the proof as 
before. 

There remains the case where $k' = \mathbf{F}_{3^n}$ and $n > 1$. Then 

$$\bar H = \mathrm{Gal}(K/\mathbf{Q}(\zeta_{p^n})) \cong \mathrm{Gal}(N/\mathbf{Q}(\zeta_p)) \cong PSL_2(k'),$$

and $W = W^*$ is the standard adjoint representation of $PSL_2(k')$ on trace-0 
matrices. Wiles relies here on a result of Cline, Parshall and Scott [CPS] that 
$H^1(\bar H, W^*) = 0$. With this the lemma is settled, and with it are concluded 
also the proofs of propositions 18 and 11, and of the main theorem. □ 

EXPLICIT FAMILIES OF ELLIPTIC CURVES 
WITH PRESCRIBED MOD N REPRESENTATIONS 


ALICE SILVERBERG 


INTRODUCTION 


In Part 1 we explain how to construct families of elliptic curves with the 
same mod 3, 4, or 5 representation as that of a given elliptic curve over Q. 
In §4 we give equations for the families in the mod 4 case. The mod 3 and 
mod 5 cases were given in [9] (see also [8]). The results remain true (with 
the same proofs) with the field of rational numbers replaced by any field 
whose characteristic does not divide the level. 

In Part 2 we use the work of Wiles, Taylor-Wiles, and Diamond to give 
explicit equations for infinite families of modular elliptic curves. In §7 
(see Theorem 7.3) we show how to find infinite families of modular elliptic 
curves with the same mod 4 representation. In §8 we prove that if E is 
an elliptic curve over $\mathbf{Q}$, and the torsion subgroup of $E(\mathbf{Q})$ is not cyclic of 
order 1, 2, 3, 6, or 9 (i.e., the torsion subgroup is cyclic of order 4, 5, 7, 8, 
10, or 12 or is of the form $\mathbf{Z}/2\mathbf{Z} \times \mathbf{Z}/2N\mathbf{Z}$ for N = 1, 2, 3 or 4), then E is 
modular (see Theorem 8.1 and Corollary 8.10). 

The proofs of the results in §4 use symbolic computer computations, 
which were done using the programs Pari and Mathematica. I would like 
to thank Ken Ribet and Karl Rubin for useful conversations, and the IHES 
for its hospitality. 


Notation. Let $\mathbf{Z}$, $\mathbf{Q}$, and $\mathbf{C}$ denote, respectively, the integers, rational 
numbers, and complex numbers. 

We will suppose that N is a positive integer, and $N \ge 3$. If E is an 
elliptic curve over a field k with algebraic closure $\bar k$, let $E[N]$ denote the 
kernel of multiplication by N on $E(\bar k)$, and let $j(E)$ denote the j-invariant 
of E. If $F \subseteq \bar{\mathbf{Q}}$ is a number field, let $G_F = \mathrm{Gal}(\bar{\mathbf{Q}}/F)$. Let $\mu_N$ be the 
$G_{\mathbf{Q}}$-module of N-th roots of unity, and let 

$$e_N : E[N] \times E[N] \to \mu_N$$

denote the Weil pairing. Let $\mathfrak{H}$ denote the complex upper half plane, and 
let 

$$\Gamma(N) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL_2(\mathbf{Z}) : \begin{pmatrix} a & b \\ c & d \end{pmatrix} \equiv \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \pmod{N} \right\}.$$

Let I denote the 2 × 2 identity matrix. 

Part 1. Elliptic curves with the same mod N representation 
1. MODULAR CURVES AND ELLIPTIC MODULAR SURFACES OF LEVEL N 


Let 

$$V_N = \mathbf{Z}/N\mathbf{Z} \times \mu_N,$$

a $G_{\mathbf{Q}}$-module, and define a $G_{\mathbf{Q}}$-equivariant pairing 

$$\eta_N : V_N \times V_N \to \mu_N$$

by 

$$\eta_N((a_1, \zeta_1), (a_2, \zeta_2)) = \zeta_2^{a_1}/\zeta_1^{a_2}.$$

Denote by $Y_N$ the (non-compact) modular curve over $\mathbf{Q}$ which 
parametrizes triples (E, P, C) where E is an elliptic curve, P is a point of exact 
order N on E, C is a cyclic subgroup of order N on E, and C and P 
generate $E[N]$. Let $Y(N)$ denote the modular curve which parametrizes 
elliptic curves with full level N structure (see [12]). If $\zeta$ is a fixed primitive 
N-th root of unity in $\bar{\mathbf{Q}}$, then the map 

$$(E, P, C) \mapsto (E, P, Q),$$

where Q is the unique point in C such that $e_N(P, Q) = \zeta$, induces an 
isomorphism (defined over $\mathbf{Q}(\zeta)$) from $Y_N$ onto one connected component 
of $Y(N)$. Thus $Y_N(\mathbf{C})$ is isomorphic to $\mathfrak{H}/\Gamma(N)$. Let $X_N$ denote the 
compactification of $Y_N$. Then $X_N$ has genus 0 if and only if $N \le 5$ (see 
p. 23 of [12]). 


Lemma 1.1. The curve $Y_N$ parametrizes isomorphism classes of pairs
$(E, \phi)$, where $E$ is an elliptic curve and

$$\phi : V_N \to E[N]$$
is a group isomorphism with the property that for all $u, v \in V_N$,
$$\eta_N(u, v) = e_N(\phi(u), \phi(v)).$$


Proof. Given $(E, P, C)$, define $\phi$ by $\phi(a, \zeta) = aP + Q$ for the unique $Q \in C$
such that $e_N(P,Q) = \zeta$. Conversely, given $(E, \phi)$, let $P = \phi(1,1)$ and let
$C = \phi(0 \times \mu_N)$. □
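
The properties of $\eta_N$ used in Lemma 1.1 can be spot-checked by brute force. The sketch below is only an illustration: it assumes we fix a primitive $N$-th root of unity $\zeta$ and store an element $(a, \zeta^k)$ of $V_N$ as the pair `(a, k)` of residues mod $N$ (this encoding, and the function names, are ours and not part of the text). Under that encoding $\eta_N$ sends a pair of elements to $\zeta^{a_1 k_2 - a_2 k_1}$, and the code checks that the pairing is alternating, bilinear, and non-degenerate.

```python
from itertools import product


def eta_exponent(u, v, N):
    """Exponent e with eta_N(u, v) = zeta**e, for u = (a1, k1), v = (a2, k2).

    (a, k) stands for (a, zeta**k), so zeta2**a1 / zeta1**a2 has exponent
    a1*k2 - a2*k1 modulo N.
    """
    (a1, k1), (a2, k2) = u, v
    return (a1 * k2 - a2 * k1) % N


def add(u, v, N):
    # Group law on V_N: add in Z/NZ, multiply in mu_N (i.e. add exponents).
    return ((u[0] + v[0]) % N, (u[1] + v[1]) % N)


def check_pairing(N):
    """Return (alternating, bilinear, nondegenerate) for eta_N."""
    V = list(product(range(N), repeat=2))
    alternating = all(eta_exponent(u, u, N) == 0 for u in V)
    bilinear = all(
        eta_exponent(add(u, v, N), w, N)
        == (eta_exponent(u, w, N) + eta_exponent(v, w, N)) % N
        for u in V for v in V for w in V
    )
    # Non-degenerate: only the zero element pairs trivially with everything.
    nondegenerate = all(
        u == (0, 0) or any(eta_exponent(u, v, N) != 0 for v in V)
        for u in V
    )
    return alternating, bilinear, nondegenerate


for N in (3, 4, 5):
    print(N, check_pairing(N))
```

In this encoding the normalization used in the proof, $\eta_N((1,1),(0,\zeta)) = \zeta$, reads `eta_exponent((1, 0), (0, 1), N) == 1`.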


There is a quasi-projective surface $W_N$ defined over Q, with a projection
morphism
$$\pi_N : W_N \to Y_N$$

and a zero-section $Y_N \to W_N$, both defined over Q, with $N^2$ sections defined
over $\bar{\mathbf{Q}}$ of order dividing $N$, and such that the fibers of $\pi_N$ correspond
to the triples $(E, P, C)$ classified by $Y_N$. (Note that this notation differs
from that of [9], where $W_N$ denoted a compactification.) The variety $W_N$
can be viewed as the universal elliptic curve with level structure as above.
See [13] for the theory of elliptic modular surfaces of level $N$. Analytically,
we have

$$W_N(\mathbf{C}) = (\mathfrak{H} \times \mathbf{C})/(\Gamma(N) \ltimes \mathbf{Z}^2).$$
If $\tau \in \mathfrak{H}$, then the equivalence class of $\tau$ in $\mathfrak{H}/\Gamma(N)$ corresponds to the
C-isomorphism class of the triple $(\mathbf{C}/(\mathbf{Z}\tau + \mathbf{Z}), \tau/N, \langle 1/N \rangle)$. Let $W_N[N]$
denote the $N^2$ sections of $\pi_N$ of order dividing $N$, viewed as a $G_{\mathbf{Q}}$-module.


2. TWISTS OF MODULAR CURVES AND ELLIPTIC MODULAR SURFACES 


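Let $\mathrm{Aut}(V_N, \eta_N)$ denote the group of automorphisms of $V_N$ which
preserve $\eta_N$. Suppose $V$ is a free rank-2 module over $\mathbf{Z}/N\mathbf{Z}$ with a continuous
and linear $G_{\mathbf{Q}}$-action and suppose

$$\eta : V \times V \to \mu_N$$
is a non-degenerate alternating $G_{\mathbf{Q}}$-equivariant pairing. Fix a group
isomorphism $\phi : V_N \to V$ under which the pairing $\eta_N$ corresponds to the
pairing $\eta$. Then $\tau \mapsto \phi^{-1} \circ \tau(\phi)$ defines a cocycle on $G_{\mathbf{Q}}$ with values in
$\mathrm{Aut}(V_N, \eta_N)$. By the universal property of $W_N$, there is a natural injective
$G_{\mathbf{Q}}$-equivariant homomorphism
$$\mathrm{Aut}(V_N, \eta_N) \hookrightarrow \mathrm{Aut}(W_N).$$
There is also a natural $G_{\mathbf{Q}}$-equivariant homomorphism
$$\mathrm{Aut}(W_N) \to \mathrm{Aut}(Y_N).$$

Therefore, the above cocycle induces cocycles $c$ and $c_0$ on $G_{\mathbf{Q}}$ with values
in $\mathrm{Aut}(W_N)$ and in $\mathrm{Aut}(Y_N)$, respectively. Let $W$ (respectively, $Y$) denote
the twist of $W_N$ (respectively, $Y_N$) by the cocycle $c$ (respectively, $c_0$) (see
[10]). Then $W$ and $Y$ are quasi-projective varieties defined over Q. Up
to isomorphism, $W$ and $Y$ are independent of the choice of $\phi$. We obtain
isomorphisms
$$\psi : W \to W_N \quad\text{and}\quad \psi_0 : Y \to Y_N$$

defined over $\bar{\mathbf{Q}}$, and a projection morphism $\pi : W \to Y$ defined over Q,
such that the diagram

$$\begin{array}{ccc} W & \xrightarrow{\ \psi\ } & W_N \\ {\scriptstyle\pi}\downarrow & & \downarrow{\scriptstyle\pi_N} \\ Y & \xrightarrow{\ \psi_0\ } & Y_N \end{array}$$
commutes, and such that for every $\tau \in G_{\mathbf{Q}}$,
$$c(\tau) = \psi \circ \tau(\psi)^{-1} \quad\text{and}\quad c_0(\tau) = \psi_0 \circ \tau(\psi_0)^{-1}.$$

It follows from the definition of $W$ that if $t \in Y(\mathbf{C})$ and $\mathcal{E}_t$ is the elliptic
curve $\pi^{-1}(t)$, then $\mathcal{E}_t[N]$ and $V$ are isomorphic as $\mathrm{Gal}(\overline{\mathbf{Q}(t)}/\mathbf{Q}(t))$-modules.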
Theorem 2.1. Suppose $N = 3$, 4, or 5, and $E$ is an elliptic curve over Q.
Then there are infinitely many elliptic curves $E'$ over Q such that $E[N]$
and $E'[N]$ are isomorphic as $G_{\mathbf{Q}}$-modules.


Proof. Let $V = E[N]$ and let $\eta = e_N$, the Weil pairing on $E[N]$. Let $W$ and
$Y$ be the varieties constructed as above, for $V$ and $\eta$, and let $X$ denote the
compactification of $Y$. Since $E$ is defined over Q, $Y(\mathbf{Q})$ is nonempty. Since
$N \le 5$, $X$ has genus 0. Therefore, $X$ is isomorphic to $\mathbf{P}^1$, and $X(\mathbf{Q})$ and
$Y(\mathbf{Q})$ are infinite. The points of $Y(\mathbf{Q})$ correspond to the desired elliptic
curves $E'$. □


3. MODELS 


Suppose now that $N = 3$, 4, or 5 and $E$ is an elliptic curve over Q with
Weierstrass model $y^2 = x^3 + ax + b$, with $a, b \in \mathbf{Q}$. We will construct
a model for $W$, where $Y$ and $W$ are the twists of $Y_N$ and $W_N$ as in §2,
with $V = E[N]$ and $\eta = e_N$. For $N = 3$, 4, and 5, let $m = 1$, 2, and 5,
respectively. Then $12m = \#\mathrm{PSL}_2(\mathbf{Z}/N\mathbf{Z})$.
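
The identity $12m = \#\mathrm{PSL}_2(\mathbf{Z}/N\mathbf{Z})$ is easy to confirm by direct enumeration. The following sketch (the function names are ours) counts matrices of determinant 1 over $\mathbf{Z}/N\mathbf{Z}$ and divides by 2, which is valid here because $I$ and $-I$ are distinct modulo $N$ for $N \ge 3$.

```python
from itertools import product


def order_sl2(N):
    """Count 2x2 matrices over Z/NZ with determinant 1 by brute force."""
    return sum(
        1
        for a, b, c, d in product(range(N), repeat=4)
        if (a * d - b * c) % N == 1
    )


def order_psl2(N):
    # For N >= 3 the scalar matrices I and -I differ mod N, so passing to
    # the quotient by {I, -I} halves the order.
    return order_sl2(N) // 2


for N, m in [(3, 1), (4, 2), (5, 5)]:
    print(N, order_psl2(N), 12 * m)
```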

We can find a model for $W_N$ (see (1) and (3) of [9] for $N = 3$ and
$N = 5$, respectively, and (5) below for $N = 4$) such that for $u \in Y_N$, the
fiber $E_u = \pi_N^{-1}(u)$ is of the form

$$(1)\qquad E_u : y^2 = x^3 + a_4(u)x + a_6(u)$$

where $a_4(u), a_6(u) \in \mathbf{Q}[u]$ and $\deg(a_j) = jm$ for $j = 4, 6$. Let $u_0$ be
an algebraic number such that $E_{u_0}$ is isomorphic (over $\bar{\mathbf{Q}}$) to $E$. The
isomorphism $\psi_0 : Y \to Y_N$ extends to an isomorphism $\psi_0 : X \to X_N$
on the compactifications. Since $X$ and $X_N$ are isomorphic to $\mathbf{P}^1$, the
isomorphism $\psi_0$ can be given by a linear fractional transformation, which
can be normalized so that 0 is sent to $u_0$. Let

$$A = \begin{pmatrix} \alpha & u_0 \\ \gamma & 1 \end{pmatrix} \in \mathrm{GL}_2(\bar{\mathbf{Q}})$$

be such a transformation, so that $A(t) = (\alpha t + u_0)/(\gamma t + 1)$. Since $\psi$ takes a fiber $\mathcal{E}_t = \pi^{-1}(t)$ in $W$
isomorphically onto a fiber $E_u = \pi_N^{-1}(u)$ in $W_N$, the isomorphism $\psi$ takes a point
$(t, x, y) \in W \subset \mathbf{P}^1 \times \mathbf{P}^2$ to a point of the form $(A(t), h(t)^{-2}x, h(t)^{-3}y) \in
W_N \subset \mathbf{P}^1 \times \mathbf{P}^2$, for appropriate $h(t)$. Therefore, $(h(t)^{-2}x, h(t)^{-3}y)$ lies on
$E_{A(t)}$. Using (1), it follows that $(t, x, y)$ satisfies

$$(2)\qquad y^2 = x^3 + h(t)^4 a_4(A(t))\,x + h(t)^6 a_6(A(t)).$$


When $t = 0$, we would like (2) to be an equation for the elliptic curve $E$.
We will solve for $h(t)^2$, $\alpha$, and $\gamma$ so that

$$(3)\qquad h(t)^4 a_4(A(t)) \quad\text{and}\quad h(t)^6 a_6(A(t))$$
are in $\mathbf{Q}[t]$ (i.e., so that (2) is a model over Q for $W$) and take on the values
$a$ and $b$, respectively, when $t = 0$. From

$$(4)\qquad h(0)^4 a_4(A(0)) = a \quad\text{and}\quad h(0)^6 a_6(A(0)) = b,$$

we can solve for $h(0)^2$. Write $h(t)^2 = h(0)^2(\gamma t + 1)^{2m}$ and substitute into
(3). Then the expressions in (3) become polynomials in $\bar{\mathbf{Q}}[t]$, and have
constant terms $a$ and $b$, respectively. In particular, (2) is $E$ when $t = 0$.
Take any ordered pair of rational numbers $(r, s)$ which is not a rational
multiple of $(4a, 6b)$, set the coefficients of $t$ in the polynomials in (3) equal
to $r$ and $s$, respectively, and solve for $\alpha$ and $\gamma$. With these values, (2)
is a model over Q for $W$. Different choices of the pair $(r, s)$ give rise to
Q-isomorphic elliptic surfaces.
Let $J = j(E)/1728$. The resulting model for $W$ is of the form

$$y^2 = x^3 + a f_4(J, t)\,x + b f_6(J, t)$$

where $f_4, f_6 \in \mathbf{Z}[J, t]$, $f_4$ and $f_6$ depend only on $N$, and $\deg_t(f_j) = jm$ for
$j = 4, 6$.


4. LEVEL 4


We begin by writing down a model for the elliptic modular surface $W_4$,
following [14]. We can view $W_4$ as a surface over Q or as an elliptic curve
$A_u$ over a function field in one variable. Define

$$A_u : y^2 = x(x-1)\left(x - \tfrac{1}{4}\left(u + \tfrac{1}{u}\right)^2\right).$$

If $u \in \mathbf{C}$ and $u \notin \{0, 1, -1, i, -i\}$, then $A_u$ is an elliptic curve over $\mathbf{Q}(u)$,
and a Weierstrass model for $A_u$ is given by

$$(5)\qquad E_u : y^2 = x^3 - 27(u^8 + 14u^4 + 1)x - 54(u^{12} - 33u^8 - 33u^4 + 1).$$
We have

$$(6)\qquad j(A_u) = j(E_u) = \frac{16(u^8 + 14u^4 + 1)^3}{u^4(u^4 - 1)^4}.$$
Let
$$P_u = \left(\frac{u^2+1}{2}, \frac{1-u^4}{4u}\right) \in A_u[4].$$

Let $C_u$ be the $\mathrm{Gal}(\overline{\mathbf{Q}(u)}/\mathbf{Q}(u))$-invariant cyclic subgroup of $A_u[4]$ generated
by the point (of order 4)
$$\left(\frac{u^2+1}{2u}, \frac{i(u^2+1)(u-1)^2}{4u^2}\right).$$
The map $u \mapsto (A_u, P_u, C_u)$ induces a morphism $f : \mathbf{P}^1 \to X_4$ defined over
Q. The morphism $j : X_4 \to \mathbf{P}^1$ induced by $(E, P, C) \mapsto j(E)$ has degree
$\#\mathrm{PSL}_2(\mathbf{Z}/4\mathbf{Z}) = 24$. By (6), the degree of the composition $j \circ f$ is 24, which
is the same as the degree of $j$. Therefore, $f$ is an isomorphism. Identify
$X_4$ with $\mathbf{P}^1$ via $f$.
Next, we give models for the twisted surfaces $W$.
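
Formula (6) can be sanity-checked with exact rational arithmetic. The sketch below assumes that $A_u$ is the Legendre curve $y^2 = x(x-1)(x-\lambda)$ with $\lambda = \tfrac14(u + 1/u)^2$ (our reading of the definition above) and uses the standard $j$-invariant formulas for Legendre and short Weierstrass models. It then compares both with the right-hand side of (6) at a few rational values of $u$. The function names are illustrative only.

```python
from fractions import Fraction


def j_legendre(lam):
    """j-invariant of y^2 = x(x-1)(x-lam)."""
    return 256 * (lam**2 - lam + 1) ** 3 / (lam**2 * (lam - 1) ** 2)


def j_short(a4, a6):
    """j-invariant of y^2 = x^3 + a4*x + a6."""
    return 1728 * 4 * a4**3 / (4 * a4**3 + 27 * a6**2)


def j_formula_6(u):
    """Right-hand side of (6)."""
    return 16 * (u**8 + 14 * u**4 + 1) ** 3 / (u**4 * (u**4 - 1) ** 4)


def check(u):
    """Compare the three expressions for j at a rational u with u^4 not 0 or 1."""
    u = Fraction(u)
    lam = (u + 1 / u) ** 2 / 4          # assumed Legendre parameter of A_u
    a4 = -27 * (u**8 + 14 * u**4 + 1)   # coefficients of model (5)
    a6 = -54 * (u**12 - 33 * u**8 - 33 * u**4 + 1)
    return j_legendre(lam) == j_short(a4, a6) == j_formula_6(u)


print(all(check(u) for u in (2, 3, Fraction(1, 2), Fraction(-5, 3))))
```

Since the numerator of (6) has degree 24 in $u$, the same formula also makes the degree count for $j \circ f$ visible.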


Theorem 4.1. Fix an elliptic curve over Q:
$$E : y^2 = x^3 + ax + b,$$

with $a, b \in \mathbf{Q}$, and let $J = j(E)/1728 = 4a^3/(4a^3 + 27b^2)$. Let $\mathcal{E}_t$ be

$$(7)\qquad \mathcal{E}_t : y^2 = x^3 + a(t)x + b(t),$$

where

$$\begin{aligned} a(t) = \big(&(J-1)^4(144J^2 - 56J - 7)t^8 - 48(J-1)^4(4J+1)t^7 \\ &+ 28(J-1)^3(4J+5)t^6 + 224(J-1)^3t^5 + 42(J-1)^2(4J-5)t^4 \\ &- 112(J-1)^2t^3 + 28(J-1)t^2 + 1\big)\,a, \end{aligned}$$

$$\begin{aligned} b(t) = \big(&(J-1)^6(1728J^3 - 144J^2 + 116J + 1)t^{12} \\ &- 12(J-1)^5(288J^3 - 128J^2 + 82J + 1)t^{11} \\ &+ 66(J-1)^5(48J^2 - 56J - 1)t^{10} \\ &- 44(J-1)^4(208J^2 - 176J - 5)t^9 \\ &- 99(J-1)^4(48J^2 - 104J - 5)t^8 + 792(J-1)^3(8J^2 - 10J - 1)t^7 \\ &- 924(J-1)^3(4J+1)t^6 + 792(J-1)^2t^5 - 99(4J-5)(J-1)^2t^4 \\ &+ 44(J-1)(6J-5)t^3 - 66(J-1)t^2 + 12t + 1\big)\,b. \end{aligned}$$

Then for every rational number $t$ such that $\mathcal{E}_t$ is nonsingular, $\mathcal{E}_t[4]$ is
isomorphic as a $G_{\mathbf{Q}}$-module to $E[4]$. If $ab \ne 0$, then (7) is a model for $W$
over Q, where $W$ is constructed as in §2 from $V = E[4]$, $\eta = e_4$.

Proof. If $a = 0$ then $\mathcal{E}_t$ is $y^2 = x^3 + b(t+1)^{12}$ and if $b = 0$ then $\mathcal{E}_t$ is
$y^2 = x^3 + ax$. In both cases the elliptic surface is isotrivial, and $\mathcal{E}_t[4]$ is
isomorphic as a $G_{\mathbf{Q}}$-module to $E[4]$. Now assume $ab \ne 0$. Let $j = j(E)$.
Using (6), a computation shows that

$$j(E_u) - j(E) = \frac{16(u^{24}+1) + (672-j)(u^{20}+u^4) + (9456+4j)(u^{16}+u^8) + (45248-6j)u^{12}}{u^4(u^4-1)^4}.$$
Let $u_0$ be a root of the numerator. Then $j(E_{u_0}) = j(E)$. Following the
algorithm in §3, we deduce from (4) that
$$h(0)^2 = \frac{a_4(u_0)\,b}{a_6(u_0)\,a} = \frac{b(u_0^8 + 14u_0^4 + 1)}{2a(u_0^{12} - 33u_0^8 - 33u_0^4 + 1)}.$$

Now solve for $\alpha$ and $\gamma$ so that the coefficients of $t$ in the polynomials in
(3) are 0 and $12b$, respectively. (This choice $(r, s) = (0, 12b)$ leads to the
relatively simple polynomials $a(t)$ and $b(t)$ in the statement of the theorem.)
We obtain
$$\alpha = \frac{(7u_0^4 + 1)b}{2^2 3^5 u_0^3 (1 - u_0^4)^3 h(0)^6}, \qquad \gamma = \frac{(u_0^4 + 7)b}{2^2 3^5 (u_0^4 - 1)^3 h(0)^6}.$$
With these values, (2) is a model over Q for $W$, and (2) is (7) with the
stated $a(t)$ and $b(t)$. This elliptic surface is not isotrivial. □
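
One consequence of Theorem 4.1 that is easy to test numerically is that $\Delta(\mathcal{E}_t)/\Delta(E)$ should be a fourth power in Q, since both curves have the same mod 4 representation. The sketch below is a spot check under our transcription of $a(t)$ and $b(t)$, not a proof. It evaluates the coefficients of (7) exactly and compares $4a(t)^3 + 27b(t)^2$ with $4a^3 + 27b^2$. The sample curve $y^2 = x^3 - 6x + 4$ (which has $J = 2$) and all function names are our own choices.

```python
from fractions import Fraction
from math import isqrt


def twisted_coefficients(a, b, t):
    """Coefficients (a(t), b(t)) of model (7) for E: y^2 = x^3 + a*x + b."""
    a, b, t = Fraction(a), Fraction(b), Fraction(t)
    J = 4 * a**3 / (4 * a**3 + 27 * b**2)
    K = J - 1
    at = (K**4 * (144 * J**2 - 56 * J - 7) * t**8
          - 48 * K**4 * (4 * J + 1) * t**7
          + 28 * K**3 * (4 * J + 5) * t**6
          + 224 * K**3 * t**5
          + 42 * K**2 * (4 * J - 5) * t**4
          - 112 * K**2 * t**3
          + 28 * K * t**2
          + 1) * a
    bt = (K**6 * (1728 * J**3 - 144 * J**2 + 116 * J + 1) * t**12
          - 12 * K**5 * (288 * J**3 - 128 * J**2 + 82 * J + 1) * t**11
          + 66 * K**5 * (48 * J**2 - 56 * J - 1) * t**10
          - 44 * K**4 * (208 * J**2 - 176 * J - 5) * t**9
          - 99 * K**4 * (48 * J**2 - 104 * J - 5) * t**8
          + 792 * K**3 * (8 * J**2 - 10 * J - 1) * t**7
          - 924 * K**3 * (4 * J + 1) * t**6
          + 792 * K**2 * t**5
          - 99 * (4 * J - 5) * K**2 * t**4
          + 44 * K * (6 * J - 5) * t**3
          - 66 * K * t**2
          + 12 * t
          + 1) * b
    return at, bt


def disc_ratio(a, b, t):
    """Delta(E_t) / Delta(E); the common factor -16 cancels."""
    at, bt = twisted_coefficients(a, b, t)
    return (4 * at**3 + 27 * bt**2) / (4 * Fraction(a) ** 3 + 27 * Fraction(b) ** 2)


def is_fourth_power(q):
    """True if the rational number q is a fourth power of a rational."""
    q = Fraction(q)

    def int_fourth(n):
        return n >= 0 and isqrt(isqrt(n)) ** 4 == n

    return int_fourth(q.numerator) and int_fourth(q.denominator)


# Hypothetical sample: E: y^2 = x^3 - 6x + 4 has J = 2.
for t in (1, -1):
    print(t, disc_ratio(-6, 4, t), is_fourth_power(disc_ratio(-6, 4, t)))
```

Setting $t = 0$ returns the original pair $(a, b)$, matching the normalization in §3.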


Theorem 4.2. Fix a nonzero integer $D$ and define $\mathcal{E}_t$ by
$$y^2 = x^3 + a(t)x + b(t)$$
where
$$a(t) = D(81D^2t^4 + 6Dt^2 + 1)(81D^2t^4 - 90Dt^2 + 1),$$
$$b(t) = 8D^2t(9Dt^2 + 1)(9D^2t^4 - 2Dt^2 + 1)(729D^2t^4 - 18Dt^2 + 1).$$

If $t \in \mathbf{Q}$ and $9Dt^2 \ne 1$, then $\mathcal{E}_t$ is an elliptic curve over Q and $\mathcal{E}_t[4]$ is
isomorphic as a $G_{\mathbf{Q}}$-module to $E[4]$, where $E$ is the elliptic curve

$$y^2 = x^3 + Dx.$$
Proof. Using (6), a computation shows that
$$j(E_u) - j(E) = \frac{2^4(u^2 - 2u - 1)^2(u^2 + 2u - 1)^2(u^4 + 1)^2(u^4 + 6u^2 + 1)^2}{u^4(u^4 - 1)^4}.$$

Let $u_0 = 1 + \sqrt{2}$, a root of $u^2 - 2u - 1$. Now follow the algorithm in §3.
We obtain

$$h(0)^4 = \frac{D}{a_4(u_0)} = \frac{(12\sqrt{2} - 17)D}{2^4 3^4}.$$

Let
$$h(0)^2 = \frac{(2\sqrt{2} - 3)\sqrt{-D}}{36}, \quad r = 0, \quad\text{and}\quad s = 8D^2.$$

Then
$$\alpha = 3\sqrt{-D}, \qquad \gamma = -3(1 + \sqrt{2})\sqrt{-D},$$

and we obtain $\mathcal{E}_t$ as in the statement of the theorem. The discriminant of
$\mathcal{E}_t$ is

$$\Delta(\mathcal{E}_t) = -2^6 D^3 (9Dt^2 - 1)^4 (81D^2t^4 + 54Dt^2 + 1)^4.$$
Thus if $t \in \mathbf{P}^1(\mathbf{Q})$ and $9Dt^2 \ne 1$, then $\mathcal{E}_t$ is an elliptic curve. The
$j$-invariant of $\mathcal{E}_t$ is
$$j(\mathcal{E}_t) = \frac{1728(81D^2t^4 + 6Dt^2 + 1)^3(81D^2t^4 - 90Dt^2 + 1)^3}{(9Dt^2 - 1)^4(81D^2t^4 + 54Dt^2 + 1)^4}.$$
Theorem 4.3. Fix a nonzero integer $D$ and define $\mathcal{E}_t$ by
$$y^2 = x^3 - 12Dt(8Dt^3 - 1)(Dt^3 + 1)x - D(8D^2t^6 + 88Dt^3 - 1)(8D^2t^6 + 1).$$

For every rational number $t$, $\mathcal{E}_t$ is an elliptic curve over Q and $\mathcal{E}_t[4]$ is
isomorphic as a $G_{\mathbf{Q}}$-module to $E[4]$, where $E$ is the elliptic curve

$$y^2 = x^3 + D.$$
Proof. We have $j(E) = 0$ and
$$j(E_u) = \frac{2^4(u^4 - 2u^3 + 2u^2 + 2u + 1)^3(u^4 + 2u^3 + 2u^2 - 2u + 1)^3}{u^4(u^4 - 1)^4}.$$
Let $u_0$ be a root of $u^4 - 2u^3 + 2u^2 + 2u + 1$. Let

$$\beta = u_0(1 + u_0 - u_0^2) \quad\text{and}\quad \lambda = \frac{(39\beta + 18)^{2/3} D^{1/3}}{6}.$$

Applying the algorithm of §3, we have

$$h(0)^6 = \frac{D}{a_6(u_0)} = \frac{(13\beta - 84)D}{2^6 3^5}.$$
Let
$$h(0)^2 = \frac{(13\beta - 84)\lambda}{18}, \quad r = 12D, \quad\text{and}\quad s = 0.$$
Then

$$\alpha = (11u_0^3 - 33u_0^2 + 49u_0 - 11)\lambda, \qquad \gamma = (3\beta - 19)\lambda,$$
and we obtain $\mathcal{E}_t$ as in the statement of the theorem. The discriminant and
$j$-invariant of $\mathcal{E}_t$ are given by
$$\Delta(\mathcal{E}_t) = -2^4 3^3 D^2 (8D^2t^6 - 20Dt^3 - 1)^4,$$
$$j(\mathcal{E}_t) = \frac{-2^{14} 3^3 D t^3 (8Dt^3 - 1)^3 (Dt^3 + 1)^3}{(8D^2t^6 - 20Dt^3 - 1)^4}.$$
Since $\Delta(\mathcal{E}_t)$ has no rational roots, the theorem follows. □
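
The discriminant formulas in the proofs of Theorems 4.2 and 4.3 are polynomial identities in $D$ and $t$, so they can be spot-checked with exact arithmetic. The sketch below (function names are ours) computes $-16(4a_4^3 + 27a_6^2)$ for each family and compares it with the factored expression stated in the corresponding proof. It depends on our reading of the exponents in the two families.

```python
from fractions import Fraction


def discriminant(a4, a6):
    """Discriminant of y^2 = x^3 + a4*x + a6."""
    return -16 * (4 * a4**3 + 27 * a6**2)


def curve_42(D, t):
    """Coefficients of the curve in Theorem 4.2, written in s = D*t^2."""
    s = D * t**2
    a4 = D * (81 * s**2 + 6 * s + 1) * (81 * s**2 - 90 * s + 1)
    a6 = (8 * D**2 * t * (9 * s + 1) * (9 * s**2 - 2 * s + 1)
          * (729 * s**2 - 18 * s + 1))
    return a4, a6


def claimed_disc_42(D, t):
    s = D * t**2
    return -(2**6) * D**3 * (9 * s - 1) ** 4 * (81 * s**2 + 54 * s + 1) ** 4


def curve_43(D, t):
    """Coefficients of the curve in Theorem 4.3, written in w = D*t^3."""
    w = D * t**3
    a4 = -12 * D * t * (8 * w - 1) * (w + 1)
    a6 = -D * (8 * w**2 + 88 * w - 1) * (8 * w**2 + 1)
    return a4, a6


def claimed_disc_43(D, t):
    w = D * t**3
    return -(2**4) * 3**3 * D**2 * (8 * w**2 - 20 * w - 1) ** 4


samples = [(Fraction(D), Fraction(n, d))
           for D in (1, -1, 2, -3) for n in (-2, 0, 1, 3) for d in (1, 2)]
ok_42 = all(discriminant(*curve_42(D, t)) == claimed_disc_42(D, t)
            for D, t in samples)
ok_43 = all(discriminant(*curve_43(D, t)) == claimed_disc_43(D, t)
            for D, t in samples)
print(ok_42, ok_43)
```

In both cases the ratio $\Delta(\mathcal{E}_t)/\Delta(E)$ is visibly a fourth power, with $\Delta(E) = -2^6D^3$ for $y^2 = x^3 + Dx$ and $\Delta(E) = -2^4 3^3 D^2$ for $y^2 = x^3 + D$.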


Part 2. Explicit families of modular elliptic curves 
5. MODULAR j-INVARIANTS 


If $E$ and $E'$ are elliptic curves over Q, and $E$ and $E'$ are isomorphic
over C, then $E$ is modular if and only if $E'$ is modular. It therefore makes
sense to talk about modular $j$-invariants, i.e., the rational numbers which
are $j$-invariants of modular elliptic curves. Before the work of Wiles, it was
not known that there are infinitely many modular $j$-invariants. Using the
results of Wiles [19], Taylor-Wiles [18], and Diamond [2], it is now very
easy to write down infinite families of modular $j$-invariants.

We begin by stating Diamond's improvement of the results of Wiles and
Taylor-Wiles. While Theorems 7.3 and 8.1 below follow easily from this
statement, in fact such results generally follow from the theorems stated in
[19], with some additional work.


Theorem 5.1. If $E$ is an elliptic curve over Q which has semistable
reduction at 3 and at 5, then $E$ is modular.


Proof. See Theorem 1.2 of [2]. □


6. SEMISTABLE REDUCTION 


We next state some results which will be used in the proofs of
Propositions 7.1 and 8.4. If $F$ is a number field, and $v$ is a prime ideal of $F$, let
$\mathcal{I}_v$ denote the inertia subgroup of $G_F$ corresponding to an extension to $\bar{\mathbf{Q}}$
of the $v$-adic valuation on $F$.


Theorem 6.1. If $E$ is an elliptic curve over Q, then there is a number
field over which $E$ has everywhere semistable reduction.


Proof. See Proposition 3.6 of [4]. See also Proposition 5.4 on p. 181 of
[17]. □


Theorem 6.2. If $E$ is an elliptic curve over Q, $p$ and $\ell$ are distinct prime
numbers, $\rho_p : G_{\mathbf{Q}} \to \mathrm{GL}_2(\mathbf{Z}_p)$ is the $p$-adic representation associated to
$E$, and $\tau \in \mathcal{I}_\ell$, then the characteristic polynomial of $\rho_p(\tau)$ has integer
coefficients which are independent of $p$.


Proof. See Theorem 4.3 of [4]. □


Theorem 6.3. Suppose $E$ is an elliptic curve over a number field $F$, $p$
and $\ell$ are distinct prime numbers, $v$ is a prime ideal of $F$ dividing $\ell$, and
$\rho_p : G_F \to \mathrm{GL}_2(\mathbf{Z}_p)$ is the $p$-adic representation associated to $E$. Then $E$
has good reduction at $v$ if and only if $\rho_p(\mathcal{I}_v)$ is trivial.


Proof. See Theorem 1 of [11]. □


Theorem 6.4. Suppose $E$ is an elliptic curve over a number field $F$, $p$
and $\ell$ are distinct prime numbers, $v$ is a prime ideal of $F$ dividing $\ell$, and
$\rho_p : G_F \to \mathrm{GL}_2(\mathbf{Z}_p)$ is the $p$-adic representation associated to $E$. Then the
following are equivalent:

(i) $E$ has semistable reduction at $v$,

(ii) for every $\tau \in \mathcal{I}_v$, all the eigenvalues of $\rho_p(\tau)$ are 1 (i.e., $\mathcal{I}_v$ acts
unipotently on the $p$-adic Tate module of $E$),

(iii) for every $\tau \in \mathcal{I}_v$, $(\rho_p(\tau) - I)^2 = 0$.


Proof. See Proposition 3.5 and Corollaire 3.8 of [4]. □

Let $\bar{\mathbf{Z}}$ denote the ring of algebraic integers.


Theorem 6.5. If $\alpha$ is a root of unity in $\bar{\mathbf{Z}}$, and either
(a) $5 \le N \in \mathbf{Z}$ and $(\alpha - 1)^2 \in N\bar{\mathbf{Z}}$, or
(b) $3 \le N \in \mathbf{Z}$ and $\alpha - 1 \in N\bar{\mathbf{Z}}$,
then $\alpha = 1$.


Proof. Part (b) is well-known. See Theorem 3.1 of [16] for proofs of (a)
and (b). □


Lemma 6.6. Suppose that $N$ is a positive integer, and for each prime
divisor $p$ of $N$ we have a matrix $A_p \in M_2(\mathbf{Z}_p)$ such that the characteristic
polynomials of the $A_p$ have integral coefficients independent of $p$, and such
that $(A_p - I)^2 \in N M_2(\mathbf{Z}_p)$. Then for every eigenvalue $\alpha$ of $A_p$, $(\alpha - 1)/\sqrt{N}$
satisfies a monic polynomial with integer coefficients.


Proof. See Lemma 5.2 of [15]. □


7. MOD 4 REPRESENTATIONS


Proposition 7.1. Suppose $E$ and $E'$ are elliptic curves over Q, $N$ is a
positive integer, $E[N]$ and $E'[N]$ are isomorphic as $G_{\mathbf{Q}}$-modules, and $\ell$ is
a prime number which does not divide $N$. If

(a) $N \ge 5$ and $E$ has semistable reduction at $\ell$, or

(b) $N = 3$ or 4 and $E$ has good reduction at $\ell$,
then $E'$ has semistable reduction at $\ell$.


Proof. We give a proof in the spirit of [15]. Let $\mathcal{I}_\ell$ denote the inertia
subgroup of $G_{\mathbf{Q}}$ corresponding to an extension $\lambda$ to $\bar{\mathbf{Q}}$ of the $\ell$-adic valuation
on Q. Suppose $\tau \in \mathcal{I}_\ell$, $p$ is a prime divisor of $N$, and $\rho_{E,p}$ and $\rho_{E',p}$ are the
$p$-adic representations of $G_{\mathbf{Q}}$ associated to $E$ and $E'$, respectively.
Suppose $\alpha$ is an eigenvalue of $\rho_{E',p}(\tau)$. There is a number field $F$ such that $E'$
has semistable reduction at the restriction $\lambda'$ of $\lambda$ to $F$ (by Theorem 6.1).
Therefore, $\tau^m \in \mathcal{I}_{\lambda'}$ for some positive integer $m$. By Theorem 6.4,

$$(\rho_{E',p}(\tau)^m - I)^2 = 0.$$
Thus, $(\alpha^m - 1)^2 = 0$ so $\alpha^m = 1$.
Since $E$ has semistable reduction at $\ell$,
$$(\rho_{E,p}(\tau) - I)^2 = 0$$
by Theorem 6.4. Since $E[N] \cong E'[N]$, we have

$$\rho_{E,p}(\tau) - \rho_{E',p}(\tau) \in N M_2(\mathbf{Z}_p)$$
for appropriate choices of bases for the $p$-adic Tate modules of $E$ and of
$E'$. Therefore,
$$(\rho_{E',p}(\tau) - I)^2 \in N M_2(\mathbf{Z}_p).$$
The characteristic polynomial of $\rho_{E',p}(\tau)$ is independent of the choice of
prime divisor $p$ of $N$ (by Theorem 6.2). By Lemma 6.6, $(\alpha - 1)^2 \in N\bar{\mathbf{Z}}$.
Suppose $N \ge 5$. By Theorem 6.5a, $\alpha = 1$. Therefore, $\mathcal{I}_\ell$ acts on the Tate
module of $E'$ by unipotent operators. By Theorem 6.4, $E'$ has semistable
reduction at $\ell$. Now suppose $N = 3$ or 4 and $E$ has good reduction at $\ell$. By
Theorem 6.3, $\rho_{E,p}(\tau) = I$. Therefore, $\rho_{E',p}(\tau) - I \in N M_2(\mathbf{Z}_p)$, $\alpha - 1 \in N\bar{\mathbf{Z}}$,
and $\alpha = 1$ (using Lemma 6.6 and Theorem 6.5b). By Theorem 6.4, $E'$ has
semistable reduction at $\ell$. □


Examples 7.2. To see that Proposition 7.1a fails for $N = 3$ or 4, let $E$ be
the elliptic curve $y^2 = x^3 + x + 1$. The conductor of $E$ is $2^4 \cdot 31$, so $E$ has
multiplicative reduction at 31. Consider Theorem 4.1 with $a = b = 1$ and
let $E'$ be the elliptic curve obtained by letting $t = 1$ (letting $t = 0$ gives $E$).
Then $E[4] \cong E'[4]$, and it is easy to check that $E'$ has additive reduction at
31. Theorem 4.1 of [9] is the analogue of Theorem 4.1 of this paper, with
$N = 3$ instead of $N = 4$. Consider Theorem 4.1 of [9] with $a = b = 1$, and
let $E''$ be the elliptic curve obtained by letting $t = 1$ (again, $t = 0$ gives $E$).
Then $E[3] \cong E''[3]$, and it is easy to check that $E''$ has additive reduction
at 31.

To see that Proposition 7.1b fails for $N = 2$, let $\ell$ be an odd prime,
let $E$ be the elliptic curve $y^2 = x^3 - x$, and let $E'$ be the elliptic curve
$y^2 = x^3 - \ell^2 x$. Then $E$ has good reduction at $\ell$ (the conductor of $E$
is $2^5$), $E'$ has additive reduction at $\ell$ (the conductor of $E'$ is $2^5\ell^2$), and
$E[2] \cong E'[2] \cong \mathbf{Z}/2\mathbf{Z} \times \mathbf{Z}/2\mathbf{Z}$.


Theorem 7.3. If $E$ is an elliptic curve over Q which has good reduction
at 3 and at 5, and $E'$ is an elliptic curve over Q such that $E[4]$ and $E'[4]$
are isomorphic as $G_{\mathbf{Q}}$-modules, then $E'$ is modular.


Proof. By Proposition 7.1b with $N = 4$, $E'$ has semistable reduction at 3
and at 5. Therefore $E'$ is modular by Theorem 5.1. □


Therefore, the explicit families of §4 give infinite families of modular
elliptic curves, as long as one of the elliptic curves in the family has good
reduction at 3 and at 5.


8. TORSION SUBGROUPS 


Theorem 8.1. If $E$ is an elliptic curve over Q which has:
(1) all its points of order 2,
(2) a cyclic subgroup of order 4,
(3) a point of order 5, or
(4) a point of order 7,
defined over Q, then $E$ is modular.


We prove Theorem 8.1 in a series of lemmas. 


Lemma 8.2. If $E$ is an elliptic curve over Q, and $E(\mathbf{Q}) \supseteq \mathbf{Z}/2\mathbf{Z} \times \mathbf{Z}/2\mathbf{Z}$,
then $E$ is modular.

Proof. It is easy to see that $E$ is isomorphic over C to an elliptic curve $E'$
of the form
$$y^2 = x(x - A)(x - B)$$

where $A$ and $B$ are relatively prime integers. Suppose $p$ is an odd prime.
Since the right hand side does not have a triple root modulo $p$, $E'$ has
semistable reduction at $p$. In other words, one can twist away any additive
reduction on $E$ at odd primes. The lemma now follows from Theorem
5.1. □


Lemma 8.2 was an observation made with K. Rubin. The main theorem
of [3] shows that Lemma 8.2 follows from the results of [19] and [18], without
using [2].


Lemma 8.3. If $E$ is an elliptic curve over Q, and $E$ has a cyclic subgroup
of order 4 defined over Q, then $E$ is modular.


Proof. Suppose $C$ is a rational cyclic subgroup of $E$ of order 4. Let $D =
C \cap E[2]$. Then $D$ is a subgroup of $E$ of order 2, and $D$ is defined over Q.
Let $E' = E/D$, an elliptic curve over Q. The quotient map $\varphi : E \to E'$ is
an isogeny defined over Q. Therefore to show that $E$ is modular, it suffices
to show that $E'$ is modular. Fix a generator $x$ of $C$ and fix $y \in E[2] - D$.
Then $\varphi(x)$ generates $C/D$, which is a rational subgroup of $E'$ of order
2. Therefore, $\varphi(x)$ is defined over Q. Similarly, $\varphi(y)$ generates $E[2]/D$,
a rational subgroup of $E'$ of order 2, so $\varphi(y)$ is defined over Q. Since
$x - y \notin C$, we have $x - y \notin D$, so $\varphi(x) \ne \varphi(y)$. Therefore, $E'$ has all its
points of order 2 defined over Q. By Lemma 8.2, $E'$ is modular. □


Proposition 8.4 (Néron, Frey, Flexor-Oesterlé,...).
If $E$ is an elliptic curve over Q, $5 \le N \in \mathbf{Z}$, and $E(\mathbf{Q}) \supseteq \mathbf{Z}/N\mathbf{Z}$, then $E$
has semistable reduction at every prime which does not divide $N$.


Proof. We give a proof from [15] (see Theorem 6.2). Suppose $\ell$ is a prime
which does not divide $N$. Let $\mathcal{I}_\ell$ denote the inertia subgroup of $G_{\mathbf{Q}}$
corresponding to an extension $\lambda$ to $\bar{\mathbf{Q}}$ of the $\ell$-adic valuation on Q. Suppose
$\tau \in \mathcal{I}_\ell$. Since $\ell$ does not divide $N$, $\mathbf{Q}(\zeta_N)$ is unramified at $\ell$, so $\mathcal{I}_\ell$ acts as
the identity on the $N$-th roots of unity. Suppose $p$ is a prime divisor of $N$,
and let $\rho_p : G_{\mathbf{Q}} \to \mathrm{GL}_2(\mathbf{Z}_p)$ denote the $p$-adic representation associated to
$E$. Since $E(\mathbf{Q})$ has a point of order $N$,

$$\rho_p(\tau) \equiv \begin{pmatrix} 1 & * \\ 0 & 1 \end{pmatrix} \pmod{N M_2(\mathbf{Z}_p)}$$
for an appropriate choice of basis for the $p$-adic Tate module of $E$.
Therefore, $(\rho_p(\tau) - I)^2 \in N M_2(\mathbf{Z}_p)$. There is a number field $F$ such that $E$ has
semistable reduction at the restriction $\lambda'$ of $\lambda$ to $F$ (by Theorem 6.1). Then
$\tau^m \in \mathcal{I}_{\lambda'}$ for some positive integer $m$. By Theorem 6.4, $(\rho_p(\tau)^m - I)^2 = 0$.
Let $\alpha$ be an eigenvalue of $\rho_p(\tau)$. Then $(\alpha^m - 1)^2 = 0$, so $\alpha^m = 1$. By
Lemma 6.6 and Theorem 6.2, $(\alpha - 1)^2 \in N\bar{\mathbf{Z}}$. By Theorem 6.5a,
$\alpha = 1$. Therefore, $\mathcal{I}_\ell$ acts on the Tate module of $E$ by unipotent operators.
By Theorem 6.4, $E$ has semistable reduction at $\ell$. □


Examples 8.5. Proposition 8.4 fails for $N < 5$. The point $(48, -15)$ is a
point of order 4 on the elliptic curve $y^2 + xy = x^3 - x^2 - 7056x + 229905$, and
this curve has additive reduction at 3 (it is the curve 63A5 in Cremona's
tables [1]). The point $(1, 1)$ is a point of order 3 on the elliptic curve
$y^2 = x^3 + x^2 - x$, and this curve has additive reduction at 2 (it is 20A2 in
[1]).


Lemma 8.6. If $E$ is an elliptic curve over Q, and $E(\mathbf{Q}) \supseteq \mathbf{Z}/7\mathbf{Z}$, then $E$
is modular.


Proof. By Proposition 8.4 with $N = 7$, $E$ has semistable reduction at 3
and at 5. By Theorem 5.1, $E$ is modular. □


Theorem 8.7. If $E$ is an elliptic curve over Q, $p$ is an odd prime number,
$E$ has semistable reduction at $p$, the mod $p$ representation $\bar{\rho}_{E,p}$ for
$E$ is modular, and the restriction of $\bar{\rho}_{E,p}$ to $G_{\mathbf{Q}(\sqrt{(-1)^{(p-1)/2}p})}$ is absolutely
irreducible, then $E$ is modular.


Proof. This is Theorem 5.3 of [2], applied to the $p$-adic representation
associated to the elliptic curve $E$. □


Proposition 8.8. If $E$ is an elliptic curve over Q, the mod 5 representation
for $E$ is reducible, and the restriction to $G_{\mathbf{Q}(\sqrt{-3})}$ of the mod 3
representation for $E$ is not absolutely irreducible, then $E$ is modular.


Proof. See Proposition 13 of [7] (by work of J. E. Cremona, the elliptic
curves over Q whose $j$-invariants are given in Proposition 13 of [7] are all
modular), p. 544 of [19], or the proof of Theorem 5.4 of [2]. □


Lemma 8.9. If $E$ is an elliptic curve over Q, and $E(\mathbf{Q}) \supseteq \mathbf{Z}/5\mathbf{Z}$, then $E$
is modular.


Proof. Since $E(\mathbf{Q}) \supseteq \mathbf{Z}/5\mathbf{Z}$, the mod 5 representation for $E$ is reducible.
By Proposition 8.8, we may assume that the restriction to $G_{\mathbf{Q}(\sqrt{-3})}$ of the
mod 3 representation for $E$ is absolutely irreducible. The mod 3 representation
for $E$ is modular by the Langlands-Tunnell Theorem. By Proposition 8.4,
$E$ has semistable reduction at 3. By Theorem 8.7, $E$ is modular. □


Theorem 8.1 follows from Lemmas 8.2, 8.3, 8.6, and 8.9. 
By [6], if $E$ is an elliptic curve over Q, then the torsion subgroup of
$E(\mathbf{Q})$ is isomorphic to one of the following 15 groups:

$$\mathbf{Z}/N\mathbf{Z} \quad\text{for } N = 1, \ldots, 10 \text{ or } 12,$$
$$\mathbf{Z}/2\mathbf{Z} \times \mathbf{Z}/2N\mathbf{Z} \quad\text{for } N = 1, 2, 3, \text{ or } 4.$$
This and Theorem 8.1 immediately imply the following result.


Corollary 8.10. If $E$ is an elliptic curve over Q, and the torsion subgroup
of $E(\mathbf{Q})$ is not a cyclic group of order 1, 2, 3, 6, or 9, then $E$ is modular.


Given a model for an elliptic curve $E$ over Q, the Nagell-Lutz Theorem
(see Corollary 7.2 of Chapter VIII of [17]) provides an algorithm for
computing the torsion subgroup of $E(\mathbf{Q})$.


9. EXPLICIT FAMILIES OF MODULAR ELLIPTIC CURVES 


By Theorem 8.1, the elliptic curves below are modular. See Table 3 on 
p. 217 of [5] for such parametrizations of elliptic curves over Q with points 
of finite order. 


Example 9.1 (rational 2-torsion).
Elliptic curves over Q with all points of order 2 defined over Q are given
by
$$Dy^2 = x(x - 1)(x - A)$$

with $D, A \in \mathbf{Q}^\times$, $A \ne 1$. The corresponding (modular) $j$-invariants are all
the numbers of the form

$$\frac{2^8(A^2 - A + 1)^3}{A^2(A - 1)^2}$$

with $A \in \mathbf{Q} - \{0, 1\}$.

Example 9.2 (rational cyclic subgroup of order 4).
The family of elliptic curves with a rational cyclic subgroup of order 4 is
given by

$$y^2 = x^3 + D(1 - 4b)x^2 - 8D^2bx + 16D^3b^2$$
with $b, D \in \mathbf{Q}^\times$, $b \ne -\tfrac{1}{16}$. The rational cyclic subgroup of order 4 is
$$\{O, (4bD, 0), (0, \pm 4bD\sqrt{D})\}$$
and the $j$-invariant is
$$\frac{(16b^2 + 16b + 1)^3}{(16b + 1)b^4}.$$

Example 9.3 (rational points of order 5).
The family of elliptic curves over Q with a rational point of order 5 is given
by

$$y^2 + (1 - b)xy - by = x^3 - bx^2$$

with $b \in \mathbf{Q}^\times$. The point $(0, 0)$ has order 5.

Example 9.4 (rational points of order 7).
The family of elliptic curves over Q with a rational point of order 7 is given
by

$$y^2 + (1 - d(d-1))xy - d^2(d-1)y = x^3 - d^2(d-1)x^2$$

with $d \in \mathbf{Q} - \{0, 1\}$. The point $(0, 0)$ has order 7.
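
The torsion claims in Examples 9.3 and 9.4 can be checked for individual parameter values by running the group law. The sketch below is an illustration with names of our choosing: it implements the standard chord-and-tangent formulas for a general Weierstrass equation $y^2 + a_1xy + a_3y = x^3 + a_2x^2 + a_4x + a_6$ over the rationals and computes the order of $(0,0)$ for a few sample values of $b$ and $d$.

```python
from fractions import Fraction


def add_points(P, Q, coeffs):
    """Add points on y^2 + a1*x*y + a3*y = x^3 + a2*x^2 + a4*x + a6.

    Points are (x, y) pairs of Fractions; None is the point at infinity.
    """
    a1, a2, a3, a4, a6 = coeffs
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2:
        if y1 + y2 + a1 * x2 + a3 == 0:
            return None  # Q is the inverse of P
        den = 2 * y1 + a1 * x1 + a3
        lam = (3 * x1**2 + 2 * a2 * x1 + a4 - a1 * y1) / den
        nu = (-x1**3 + a4 * x1 + 2 * a6 - a3 * y1) / den
    else:
        lam = (y2 - y1) / (x2 - x1)
        nu = (y1 * x2 - y2 * x1) / (x2 - x1)
    x3 = lam**2 + a1 * lam - a2 - x1 - x2
    y3 = -(lam + a1) * x3 - nu - a3
    return (x3, y3)


def order_of_point(P, coeffs, bound=20):
    """Smallest n <= bound with n*P = O, or None if there is none."""
    Q = P
    for n in range(1, bound + 1):
        if Q is None:
            return n
        Q = add_points(Q, P, coeffs)
    return None


def order5_curve(b):
    """(a1, a2, a3, a4, a6) for the family of Example 9.3."""
    b = Fraction(b)
    return (1 - b, -b, -b, Fraction(0), Fraction(0))


def order7_curve(d):
    """(a1, a2, a3, a4, a6) for the family of Example 9.4."""
    d = Fraction(d)
    return (1 - d * (d - 1), -d**2 * (d - 1), -d**2 * (d - 1),
            Fraction(0), Fraction(0))


origin = (Fraction(0), Fraction(0))
print([order_of_point(origin, order5_curve(b)) for b in (1, 2, -3)])
print([order_of_point(origin, order7_curve(d)) for d in (2, 3, -1)])
```

For $b = 1$ the multiples of $(0,0)$ are $(1,1)$, $(1,0)$, $(0,1)$, and then the point at infinity, which can be followed by hand.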
MODULARITY OF MOD 5 REPRESENTATIONS


KARL RUBIN 


INTRODUCTION 


The aim of this paper is to tie everything together to prove the following 
theorem. 


Theorem A (Wiles [19] + Taylor & Wiles [17] + Diamond [3]).
If $E$ is an elliptic curve defined over Q and $E$ is semistable at 3 and 5,
then $E$ is modular.


Our starting point will be Theorems B and C below. If $E$ is an elliptic
curve defined over a field $F$ of characteristic zero, let $G_F = \mathrm{Gal}(\bar{F}/F)$ and
for every prime $p$, let

$$\bar{\rho}_{F,E,p} : G_F \to \mathrm{GL}_2(\mathbf{F}_p)$$

be (the isomorphism class of) the representation of $G_F$ on the $p$-torsion
$E[p]$ in $E(\bar{F})$. We will write simply $\bar{\rho}_{E,p}$ for $\bar{\rho}_{\mathbf{Q},E,p}$.


Theorem B (Wiles [19] + Taylor & Wiles [17] + Diamond [3]).
Suppose $E$ is an elliptic curve over Q, and $p$ is an odd prime, such that

• $E$ is semistable at $p$,
• $\bar{\rho}_{E,p}$ restricted to $\mathrm{Gal}(\bar{\mathbf{Q}}/\mathbf{Q}(\sqrt{(-1)^{(p-1)/2}p}))$ is absolutely irreducible,
and
• $\bar{\rho}_{E,p}$ is modular.

Then $E$ is modular.

Theorem C (Langlands [6] + Tunnell [18]). Suppose $\rho : G_{\mathbf{Q}} \to \mathrm{GL}_2(\mathbf{F}_3)$
is a continuous representation satisfying

• $\rho$ is irreducible, and
• $\det(\rho(\text{complex conjugation})) = -1$.

Then $\rho$ is modular.


Partially supported by an NSF grant. The author also thanks the Institut des Hautes 
Etudes Scientifiques and the Institute for Advanced Study for their hospitality and 
support. 
Theorem B (for $p = 3$) and Theorem C together prove Theorem A
under the extra hypothesis that $\bar{\rho}_{E,3}$ restricted to $G_{\mathbf{Q}(\sqrt{-3})}$ is absolutely
irreducible. When $\bar{\rho}_{E,3}$ restricted to $G_{\mathbf{Q}(\sqrt{-3})}$ is not absolutely irreducible
we will complete the proof of Theorem A by applying Theorem B with
$p = 5$. We need to show that the hypotheses of Theorem B are satisfied,
and this will follow from the following two theorems:


Theorem 1. Suppose $E$ is an elliptic curve over Q such that
• $\bar{\rho}_{E,3}$ restricted to $G_{\mathbf{Q}(\sqrt{-3})}$ is not absolutely irreducible,
• $E$ is semistable at 5, and
• the $j$-invariant $j(E)$ is not $11^3/2^3$ or $-29^3 41^3/2^{15}$.

Then $\bar{\rho}_{E,5}$ restricted to $G_{\mathbf{Q}(\sqrt{5})}$ is absolutely irreducible.


Theorem 2. Suppose $E$ is an elliptic curve over Q such that
• $E$ is semistable at 3, and
• $\bar{\rho}_{E,5}$ is irreducible.

Then $\bar{\rho}_{E,5}$ is modular.


Theorem A is immediate from Theorems B, C, 1 and 2, together with
the observation that the exceptional curves in Theorem 1 are modular.
(More precisely, these curves are twists of the curves listed as 338E1 and
338E2 in [2]. Although it is stated in §2.14 of [2] that these models are not
proved to be modular, John Cremona informs me that due to subsequent
progress the computations in [2] do suffice to prove that these curves are
the modular curves they appear to be.)

The fundamental hypothesis in Theorem B is that $\bar{\rho}_{E,p}$ is modular.
When $p = 3$ this is known because of Theorem C, and when $p = 5$ we
will use Theorem 2. To prove Theorem 2 we must have at our disposal a
large collection of modular forms. The modular forms we use will be the
ones produced by Theorems B and C when $p = 3$.

The proofs of Theorems 1 and 2 are essentially due to Wiles, and can be
found with varying degrees of generality and detail in [19] Chapter 5 and
in [3], proof of Theorem 5.4.

If $F$ is an arbitrary field, let $G_F = \mathrm{Gal}(F^s/F)$ where $F^s$ is a separable
closure of $F$. If $p$ is a prime not equal to the characteristic of $F$ let $\chi_p :
G_F \to \mathrm{Aut}(\mu_p) \cong \mathbf{F}_p^\times$ be the cyclotomic character.

The following statement was first pointed out to me by Richard Taylor,
and is proved in [11]. Although not necessary for the proof of Theorem A,
it is related to Theorem 2 (see Theorem 4 below) and we will prove it in §5.


Theorem 3. Let $p$ be 3 or 5, and let $F$ be a field of characteristic different
from $p$. Suppose $\rho : G_F \to \mathrm{GL}_2(\mathbf{F}_p)$ is a representation such that $\det(\rho) =
\chi_p$. Then there is an elliptic curve $E$ defined over $F$ such that $\rho \cong \bar{\rho}_{F,E,p}$.

Combining Theorems 2 and 3 (with F = Q) gives the following. 
Theorem 4. Suppose $\rho : G_{\mathbf{Q}} \to \mathrm{GL}_2(\mathbf{F}_5)$ is a representation satisfying

• $\rho$ is semistable at 3 (i.e., the image of an inertia group at 3 is unipotent),
• $\rho$ is irreducible, and
• $\det(\rho) = \chi_5$.

Then $\rho$ is modular.


Remark. The proof of Theorem 1 depends in a crucial way on the fact
that the genus of the modular curve $X_0(3 \cdot 5)$ is greater than zero, and the
proof of Theorem 2 on the fact that the genus of $X(5)$ is equal to zero. For
primes $p$ (see [13] §1.6),

$$\mathrm{genus}(X_0(3p)) > 0 \iff p \ge 5,$$
$$\mathrm{genus}(X(p)) = 0 \iff p \le 5.$$
Thus 5 is the only prime that will work for Wiles' argument.

Acknowledgment. I would like to thank Alice Silverberg for helpful conversations.


1. PRELIMINARIES: GROUP THEORY 


Lemma 5. Suppose $E$ is an elliptic curve over a field $F$ and $p > 2$ is
prime. If $F$ has an embedding into R and $\bar{\rho}_{F,E,p}$ is irreducible, then $\bar{\rho}_{F,E,p}$
is absolutely irreducible.


Proof. Fix a complex conjugation $\tau \in G_F$ and write $E[p]^+$ (resp. $E[p]^-$)
for the subspace of $E[p]$ where $\tau$ acts via $+1$ (resp. $-1$). The characteristic
polynomial of $\bar{\rho}_{F,E,p}(\tau)$ is $x^2 - 1$, so $\dim_{\mathbf{F}_p} E[p]^+ = \dim_{\mathbf{F}_p} E[p]^- = 1$.

Suppose $\bar{\rho}_{F,E,p}$ is not absolutely irreducible, i.e., there is a one-dimensional
subspace $W$ of $E[p] \otimes \bar{\mathbf{F}}_p$ which is stable under $G_F$. Then

$$W = E[p]^\pm \otimes \bar{\mathbf{F}}_p$$
for some choice of sign, since no other one-dimensional subspace is stable
under $\tau$. Hence $E[p]^\pm = W \cap E[p]$ is stable under $G_F$, so $\bar{\rho}_{F,E,p}$ is reducible.
□


Proposition 6. Suppose $E$ is an elliptic curve over Q such that
• $\bar{\rho}_{E,3}$ is irreducible, and
• the restriction of $\bar{\rho}_{E,3}$ to $G_{\mathbf{Q}(\sqrt{-3})}$ is not absolutely irreducible.

Then there are distinct subgroups $C_3$, $C_3'$ of order 3 of $E[3]$ such that the
unordered set $\{C_3, C_3'\}$ is fixed by $G_{\mathbf{Q}}$.


Proof. Fix a complex conjugation $\tau \in G_{\mathbf{Q}}$. Our hypotheses on $\bar{\rho}_{E,3}$ imply
(using Lemma 5) that there is a one-dimensional subspace $W$ of $E[3] \otimes \bar{\mathbf{F}}_3$
which is stable under $G_{\mathbf{Q}(\sqrt{-3})}$ but not under $G_{\mathbf{Q}}$. Then $W^\tau$ is also stable
under $G_{\mathbf{Q}(\sqrt{-3})}$, and $E[3] \otimes \bar{\mathbf{F}}_3 = W \oplus W^\tau$.
It follows that $\bar{\rho}_{E,3}(G_{\mathbf{Q}(\sqrt{-3})})$ is a cyclic subgroup of $\mathrm{SL}_2(\mathbf{F}_3)$, of order
prime to 3, and so the image of $G_{\mathbf{Q}(\sqrt{-3})}$ in $\mathrm{PGL}_2(\mathbf{F}_3)$ has order at most 2.
Let $\mathcal{C}$ denote the set of 4 subgroups of order 3 of $E[3]$, and let $E[3]^+ \in \mathcal{C}$ be
the subgroup of points fixed by complex conjugation. Then $G_{\mathbf{Q}}$ acts on $\mathcal{C}$ by
a subgroup of $S_4$ of order at most 4. Since $\tau$ fixes $E[3]^+$ but acts nontrivially
on $\mathcal{C}$, the $G_{\mathbf{Q}}$-orbit of $E[3]^+$ has order at most 2. The assumption that $\bar{\rho}_{E,3}$
is irreducible implies that this orbit has order exactly 2, so this orbit is the
desired set $\{C_3, C_3'\}$. □


Proposition 7. Suppose FE is an elliptic curve over Q such that 
® PEs is irreducible and 
9 FE is semistable at 5. 


Then the restriction of prs to Gav) ts absolutely irreducible. 


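Proof. Let $I_5 \subset G_{\mathbf{Q}}$ denote the inertia group of some prime above 5, and
fix $\sigma \in I_5$ such that $\sigma \notin G_{\mathbf{Q}(\sqrt{5})}$ (possible since 5 ramifies in $\mathbf{Q}(\sqrt{5})/\mathbf{Q}$).
Suppose that $E$ satisfies the hypotheses of the proposition but the restriction
of $\bar\rho_{E,5}$ to $G_{\mathbf{Q}(\sqrt{5})}$ is not absolutely irreducible. By Lemma 5, this
restriction is then reducible, so there is a subgroup $C_5$ of $E[5]$ which is
stable under $G_{\mathbf{Q}(\sqrt{5})}$ but not under $G_{\mathbf{Q}}$. Then $C_5^\sigma$ is also stable under $G_{\mathbf{Q}(\sqrt{5})}$,
and $E[5] = C_5 \oplus C_5^\sigma$. With a basis of $E[5]$ chosen from $C_5$ and $C_5^\sigma$, since

$$\det(\bar\rho_{E,5}(G_{\mathbf{Q}(\sqrt{5})})) = \chi_5(G_{\mathbf{Q}(\sqrt{5})}) = \{\pm 1\} \quad\text{and}\quad \det(\bar\rho_{E,5}(\sigma)) = \chi_5(\sigma) \notin \{\pm 1\}$$

we see that

$$\bar\rho_{E,5}(G_{\mathbf{Q}(\sqrt{5})}) \subset \left\{ \begin{pmatrix} a & 0 \\ 0 & \pm a^{-1} \end{pmatrix} : a \in \mathbf{F}_5^\times \right\} \quad\text{and}\quad \bar\rho_{E,5}(\sigma G_{\mathbf{Q}(\sqrt{5})}) \subset \left\{ \begin{pmatrix} 0 & a \\ \pm 2a^{-1} & 0 \end{pmatrix} : a \in \mathbf{F}_5^\times \right\}.$$

Case I: $E$ is ordinary or multiplicative at 5. Since $\sigma \in I_5$, $\sigma$ stabilizes a
proper subgroup of $E[5]$ (see [9], Proposition 13 and the remarks before
Proposition 11). But the description above shows this is impossible.

Case II: $E$ is supersingular at 5. Proposition 12 of [9] shows that $\bar\rho_{E,5}(I_5)$
is cyclic of order 24. Again, the description above shows that $\bar\rho_{E,5}(G_{\mathbf{Q}})$ has
no such subgroup. □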
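
The two impossibility claims in this proof can be checked by brute force. The following Python sketch is our own illustration and not part of the original argument; the names `anti_diagonal` and `normalizer` are introduced here only for this check. It enumerates the anti-diagonal matrices over $\mathbf{F}_5$ whose determinant is not $\pm 1$ and confirms that none of them stabilizes a line (Case I). It also confirms that the group of all diagonal and anti-diagonal invertible matrices has order 32, so by Lagrange's theorem it has no subgroup of order 24 (Case II).

```python
from itertools import product

P = 5
UNITS = [1, 2, 3, 4]

def mat_vec(m, v):
    """Apply the 2x2 matrix m = (a, b, c, d) to the column vector v over F_5."""
    a, b, c, d = m
    x, y = v
    return ((a * x + b * y) % P, (c * x + d * y) % P)

def stabilizes_a_line(m):
    """True if some nonzero v in F_5^2 is sent to a nonzero scalar multiple of itself."""
    for v in product(range(P), repeat=2):
        if v == (0, 0):
            continue
        w = mat_vec(m, v)
        if any(w == ((s * v[0]) % P, (s * v[1]) % P) for s in UNITS):
            return True
    return False

# Anti-diagonal matrices (0 a; c 0) whose determinant -a*c is not +1 or -1.
anti_diagonal = [(0, a, c, 0) for a in UNITS for c in UNITS
                 if (-a * c) % P not in (1, 4)]

# All diagonal and anti-diagonal invertible matrices (the ambient group above).
normalizer = ([(a, 0, 0, d) for a in UNITS for d in UNITS]
              + [(0, b, c, 0) for b in UNITS for c in UNITS])

no_stable_line = not any(stabilizes_a_line(m) for m in anti_diagonal)
normalizer_order = len(normalizer)
```

The first check works because such a matrix has an eigenvector over $\mathbf{F}_5$ only if $ac$ is a square, while here $ac \in \{2, 3\}$.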


2. PRELIMINARIES: MODULAR CURVES 


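Write $\mathcal{H}$ for the complex upper half plane. If $\Gamma$ is a congruence subgroup
of $\mathrm{SL}_2(\mathbf{Z})$, $\overline{\mathcal{H}/\Gamma}$ will denote the compactification of $\mathcal{H}/\Gamma$, obtained
by adjoining a finite set of cusps ([13] §1.3).
We will need the congruence groups

$$\Gamma(N) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}_2(\mathbf{Z}) : \begin{pmatrix} a & b \\ c & d \end{pmatrix} \equiv \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \pmod{N} \right\}$$
$$\Gamma_0(N) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}_2(\mathbf{Z}) : c \equiv 0 \pmod{N} \right\}$$
$$\Gamma_{\mathrm{split}}(N) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}_2(\mathbf{Z}) : \text{either } b \equiv c \equiv 0 \text{ or } a \equiv d \equiv 0 \pmod{N} \right\}.$$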


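Lemma 8. As in [13] §1.6 define
  $\mu(\Gamma) = [\mathrm{SL}_2(\mathbf{Z}) : \{\pm 1\}\Gamma]$,
  $\nu_\infty(\Gamma)$ = number of cusps of $\overline{\mathcal{H}/\Gamma}$,
  $\nu_i(\Gamma)$ = number of elliptic points of order $i$ of $\overline{\mathcal{H}/\Gamma}$, $i$ = 2 or 3.
Then we have the following table: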


  Γ                        μ     ν_∞    ν_2    ν_3    genus
  Γ_0(15)                  24      4      0      0       1
  Γ_0(5) ∩ Γ_split(3)      36      4      4      0       1
  Γ(5) ∩ Γ_0(3)           240     24      0      0       9
  Γ(5) ∩ Γ_split(3)       360     24      0      0      19

Proof. Let $\Gamma$ denote one of the four groups in the table. Then $\Gamma(15) \subset \Gamma \subset
\mathrm{SL}_2(\mathbf{Z})$, so $\mu(\Gamma)$ is the index of the image of $\Gamma$ in $\mathrm{PSL}_2(\mathbf{Z}/15\mathbf{Z})$, which is
easily computed.

The cusps of $\overline{\mathcal{H}/\Gamma(15)}$ are described in [13] §1.6: there are 96 of them,
and they can be identified with the coset space $\mathrm{PSL}_2(\mathbf{Z}/15\mathbf{Z})/U$ where $U$
is the subgroup generated by $\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$. From this it is not difficult to work
out the number of cusps for each of the four groups (for $\Gamma_0(15)$ it is done
in Proposition 1.43 of [13]).

For elliptic points, Propositions 1.43 and 1.39 of [13] cover all of the 0
entries in the table, and $\nu_2(\Gamma_0(5) \cap \Gamma_{\mathrm{split}}(3))$ can be worked out with some
effort using the methods of [13] §1.6.

The last column follows from Proposition 1.40 of [13], which says that
the genus of $\overline{\mathcal{H}/\Gamma}$ is $1 + \mu(\Gamma)/12 - \nu_2(\Gamma)/4 - \nu_3(\Gamma)/3 - \nu_\infty(\Gamma)/2$. □
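
The genus formula quoted from Proposition 1.40 of [13] is easy to evaluate mechanically. The Python sketch below is an illustration of ours. The invariants $(\mu, \nu_\infty, \nu_2, \nu_3)$ it uses are an assumption of the sketch: we recomputed them from the indices and cusp counts of the four groups rather than reading them from a printed table, so they should be checked against [13] before being relied on.

```python
from fractions import Fraction

def genus(mu, nu_inf, nu2, nu3):
    """Genus 1 + mu/12 - nu2/4 - nu3/3 - nu_inf/2 of the compactified quotient."""
    g = (1 + Fraction(mu, 12) - Fraction(nu2, 4)
         - Fraction(nu3, 3) - Fraction(nu_inf, 2))
    if g.denominator != 1:
        raise ValueError("inconsistent invariants: genus is not an integer")
    return int(g)

# (mu, nu_inf, nu2, nu3); assumed values, recomputed for this sketch.
INVARIANTS = {
    "Gamma0(15)": (24, 4, 0, 0),
    "Gamma0(5) & Gamma_split(3)": (36, 4, 4, 0),
    "Gamma(5) & Gamma0(3)": (240, 24, 0, 0),
    "Gamma(5) & Gamma_split(3)": (360, 24, 0, 0),
}

GENERA = {name: genus(*inv) for name, inv in INVARIANTS.items()}
```

With these inputs the first two curves have genus one, as used in Lemmas 9 and 10, and the last two have genus greater than one, as used in Lemma 12.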


2.1. The modular curve $X_0(15)$. Let $X_0(15)$ be the modular curve over
$\mathbf{Q}$ which parametrizes isomorphism classes of pairs $(E, C_{15})$ where $E$ is an
elliptic curve and $C_{15}$ is a subgroup of $E$ of order 15.


Lemma 9. (i) $X_0(15)$ is a curve of genus one with four cusps, all of
which are rational over $\mathbf{Q}$.

(ii) $\#(X_0(15)(\mathbf{Q})) = 8$.

(iii) The four rational points of $X_0(15)$ which are not cusps correspond to
four pairs $(E_i, C_i)$ where
$$j(E_i) \in \{-25/2,\ -5^2 \cdot 241^3/2^3,\ -5 \cdot 29^3/2^5,\ 5 \cdot 211^3/2^{15}\}.$$

(iv) If $E$ is an elliptic curve over $\mathbf{Q}$ with a rational subgroup of order 15,
then $E$ is not semistable at 5.

Proof. Since $X_0(15)(\mathbf{C}) = \overline{\mathcal{H}/\Gamma_0(15)}$, the genus and the number of cusps
of $X_0(15)$ are given in Lemma 8. The fact that the four cusps are rational
follows from the action of $G_{\mathbf{Q}}$ on the cusps (see for example [5] §VI.5).

From the tables [1] 15C or [2] 15A1 one can read off that $X_0(15)$ has 8
rational points. In the same tables one finds 4 elliptic curves over $\mathbf{Q}$ (50A1,
50A2, 50A3, and 50A4 in [2]), with invariants

$$j = -25/2,\quad -5^2 \cdot 241^3/2^3,\quad -5 \cdot 29^3/2^5,\quad \text{and}\quad 5 \cdot 211^3/2^{15},$$

respectively, each isogenous to $E_0 : y^2 + xy + y = x^3 - x - 2$ and having
a rational subgroup of order 15. Thus these four curves represent all the
non-cusp rational points of $X_0(15)$. This proves (iii).

Suppose $E$ is defined over $\mathbf{Q}$ and has a rational subgroup $C_{15}$ of order
15. Then $(E, C_{15}) \in X_0(15)(\mathbf{Q})$, so by (iii), $E$ is isogenous to a quadratic
twist of $E_0$. The equation above for $E_0$ is minimal and is not semistable
at 5. Since it has discriminant $-2 \cdot 5^4$, this equation remains minimal over
any quadratic extension of $\mathbf{Q}$ (see for example [16] Exercise 7.1). It follows
that $E$ is not semistable at 5. □
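
The numerical facts about $E_0$ used in this proof can be recomputed from the standard Weierstrass quantities $b_2, b_4, b_6, b_8, c_4, \Delta$. The following Python sketch is our own check, not taken from the tables [1] or [2]; the function name `weierstrass_invariants` is introduced here for illustration. It confirms that $E_0$ has discriminant $-2 \cdot 5^4$ and $j = -25/2$, and that 5 divides $c_4$, which for a minimal equation with $5 \mid \Delta$ means additive reduction at 5.

```python
from fractions import Fraction

def weierstrass_invariants(a1, a2, a3, a4, a6):
    """Return (c4, discriminant, j) for y^2 + a1*x*y + a3*y = x^3 + a2*x^2 + a4*x + a6."""
    b2 = a1 * a1 + 4 * a2
    b4 = 2 * a4 + a1 * a3
    b6 = a3 * a3 + 4 * a6
    b8 = a1 * a1 * a6 + 4 * a2 * a6 - a1 * a3 * a4 + a2 * a3 * a3 - a4 * a4
    c4 = b2 * b2 - 24 * b4
    disc = -b2 * b2 * b8 - 8 * b4 ** 3 - 27 * b6 * b6 + 9 * b2 * b4 * b6
    return c4, disc, Fraction(c4 ** 3, disc)

# E_0 : y^2 + xy + y = x^3 - x - 2
C4, DISC, J = weierstrass_invariants(1, 0, 1, -1, -2)
```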


2.2. The modular curve $X_{0,\mathrm{split}}(5,3)$. Let $X_{0,\mathrm{split}}(5,3)$ be the modular
curve over $\mathbf{Q}$ which parametrizes triples $(E, C_5, \{C_3, C_3'\})$ where $E$ is an
elliptic curve, $C_5$ is a subgroup of order 5, and $\{C_3, C_3'\}$ is an unordered
set of distinct subgroups of order 3.


Lemma 10. (i) $X_{0,\mathrm{split}}(5,3)$ is a curve of genus one with four cusps, all
of which are rational over $\mathbf{Q}$.
(ii) $\#(X_{0,\mathrm{split}}(5,3)(\mathbf{Q})) = 8$.
(iii) The four rational points of $X_{0,\mathrm{split}}(5,3)$ which are not cusps correspond
to four triples $(E_i, \ldots)$ where $j(E_i) \in \{11^3/2^3,\ -29^3 \cdot 41^3/2^{15}\}$.


Proof. The classical theory of modular curves shows that
$$X_{0,\mathrm{split}}(5,3)(\mathbf{C}) = \overline{\mathcal{H}/(\Gamma_0(5) \cap \Gamma_{\mathrm{split}}(3))}.$$

Thus the genus and the number of cusps of $X_{0,\mathrm{split}}(5,3)$ are given in
Lemma 8. The fact that the four cusps are rational follows from the action
of $G_{\mathbf{Q}}$ on the cusps (see for example [5] §VI.5).

Define a map $f$ from $X_{0,\mathrm{split}}(5,3)$ to the jacobian $J_0(15)$ of $X_0(15)$ by

$$f : (E, C_5, \{C_3, C_3'\}) \mapsto (E, C_5 + C_3) + (E, C_5 + C_3') - 2[\infty]$$

where $[\infty]$ denotes the infinity cusp on $X_0(15)$. Then $f$ is not constant,
because if $E$ is an elliptic curve without complex multiplication by $\mathbf{Q}(i)$
and $C_3$, $C_3'$, $C_3''$ are three distinct subgroups of order 3, then

$$f(E, C_5, \{C_3, C_3'\}) - f(E, C_5, \{C_3, C_3''\}) = (E, C_5 + C_3') - (E, C_5 + C_3'')$$

which is not zero in $J_0(15)$ since $(E, C_5 + C_3')$, $(E, C_5 + C_3'')$ represent
distinct points of $X_0(15)$ and $X_0(15)$ has genus 1 (Lemma 9(i)). Therefore
$X_{0,\mathrm{split}}(5,3)$ is isogenous to $X_0(15)$ and so $\#(X_{0,\mathrm{split}}(5,3)(\mathbf{Q})) \le 8$ (see [1]
or [2]).

In [2], one finds that the elliptic curve denoted 338E1, $E : y^2 + xy + y =
x^3 + x^2 + 3x - 5$, has a rational subgroup of order 5. Changing variables to the
model $Y^2 = X^3 + 5X^2 + 56X - 304$, one computes that the $X$-coordinates
of the points of order 3 are the roots of $(X^2 + 12X + 192)(3X^2 - 16X - 48)$.
Thus $E$ gives rise to two of the non-cusp rational points of $X_{0,\mathrm{split}}(5,3)$,
corresponding to the two choices of $G_{\mathbf{Q}}$-stable pairs of subgroups of order
3. (These two points are distinct because $\mathrm{Aut}(E) = \pm 1$.) One computes
easily that $j(E) = 11^3/2^3$. The quotient of $E$ by its rational subgroup of
order 5 (denoted 338E2 in [2]) gives rise to the other two non-cusp points
of $X_{0,\mathrm{split}}(5,3)(\mathbf{Q})$ and has $j$-invariant $-29^3 \cdot 41^3/2^{15}$. □
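
The explicit computations for the curve 338E1 can be reproduced directly. The Python sketch below is our own verification; the helper names `j_invariant`, `psi3` and `poly_mul` are introduced here for illustration. It checks that $j(E) = 11^3/2^3$ for both models and that the 3-division polynomial $3X^4 + 4aX^3 + 6bX^2 + 12cX + (4ac - b^2)$ of $Y^2 = X^3 + aX^2 + bX + c$ factors as stated.

```python
from fractions import Fraction

def j_invariant(a1, a2, a3, a4, a6):
    """j-invariant of y^2 + a1*x*y + a3*y = x^3 + a2*x^2 + a4*x + a6."""
    b2 = a1 * a1 + 4 * a2
    b4 = 2 * a4 + a1 * a3
    b6 = a3 * a3 + 4 * a6
    b8 = a1 * a1 * a6 + 4 * a2 * a6 - a1 * a3 * a4 + a2 * a3 * a3 - a4 * a4
    c4 = b2 * b2 - 24 * b4
    disc = -b2 * b2 * b8 - 8 * b4 ** 3 - 27 * b6 * b6 + 9 * b2 * b4 * b6
    return Fraction(c4 ** 3, disc)

def psi3(a, b, c):
    """3-division polynomial of Y^2 = X^3 + a*X^2 + b*X + c, lowest degree first."""
    return [4 * a * c - b * b, 12 * c, 6 * b, 4 * a, 3]

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for k, qk in enumerate(q):
            out[i + k] += pi * qk
    return out

J_338E1 = j_invariant(1, 1, 1, 3, -5)          # y^2 + xy + y = x^3 + x^2 + 3x - 5
J_MODEL = j_invariant(0, 5, 0, 56, -304)       # Y^2 = X^3 + 5X^2 + 56X - 304
FACTORED = poly_mul([192, 12, 1], [-48, -16, 3])
```

The quadratic factors correspond to the two pairs of order 3 subgroups that are each stable under $G_{\mathbf{Q}}$ as unordered pairs.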


2.3. Twists of modular curves. Suppose $p > 2$ is a prime. As in [15] §1
or [7] §1, there is an open modular curve $Y_p$ over $\mathbf{Q}$ whose points correspond
to isomorphism classes of pairs $(E, \phi)$ where $E$ is an elliptic curve and
$\phi : E[p] \to \mathbf{F}_p \times \mu_p$ is an isomorphism such that $\det(\phi) : \wedge^2 E[p] \to \mu_p$ is
the Weil pairing. Define $X_p$ to be the compactification of $Y_p$. When $p = 3$
or 5 (which are the only cases we will use) $X_p$ has genus zero, and explicit
models are given in [7]. Over $\mathbf{Q}(\mu_p)$, a choice of primitive $p$-th root of unity
induces an isomorphism from $X_p$ to one component of the usual modular
curve $X(p)$.

More generally, suppose that $V$ is a 2-dimensional $\mathbf{F}_p$-vector space with
an action of $G_{\mathbf{Q}}$ and $\eta : \wedge^2 V \to \mu_p$ is a non-degenerate alternating pairing.
As explained in [7] Remark 2.4 or [15] §2, there is an open curve $Y_V$ defined
over $\mathbf{Q}$ whose points parametrize isomorphism classes of pairs $(E, \phi)$ where
$E$ is an elliptic curve and $\phi : E[p] \to V$ is an isomorphism such that

$$\eta \circ \det(\phi) : \wedge^2 E[p] \to \wedge^2 V \to \mu_p$$

is the Weil pairing. Define $X_V$ to be the compactification of $Y_V$. In
particular $X_p = X_{\mathbf{F}_p \times \mu_p}$.

Now suppose $p = 5$, $E$ is an elliptic curve, $V = E[5]$, and $\eta$ is the
Weil pairing. We will write $Y_E$ and $X_E$ for $Y_V$ and $X_V$, and we have the
following explicit result ([7] §5).


Proposition 11. Suppose $E$ is an elliptic curve over $\mathbf{Q}$ and $p = 5$. There
is an isomorphism $\psi : \mathbf{P}^1 \to X_E$ defined over $\mathbf{Q}$ and two polynomials
$f_E(t), g_E(t) \in \mathbf{Q}[t]$ of degrees 20 and 30, respectively, such that if $t \in \mathbf{Q}$ is
not in the finite set $\psi^{-1}(X_E(\mathbf{Q}) - Y_E(\mathbf{Q}))$ then $\psi(t) \in Y_E(\mathbf{Q})$ is represented
by the elliptic curve

$$E_t : y^2 = x^3 + f_E(t)x + g_E(t).$$

In particular for every such $t$, $\bar\rho_{E_t,5} \cong \bar\rho_{E,5}$.

Similarly there are open curves $Y_E'$ and $Y_E''$ which parametrize isomorphism
classes of triples $(E', \phi, C_3)$ and $(E', \phi, \{C_3, C_3'\})$, respectively, where
  • $E'$ is an elliptic curve,
  • $\phi : E'[5] \to E[5]$ is an isomorphism taking the Weil pairing on $E'[5]$
    to that on $E[5]$,
  • $C_3$ and $C_3'$ are (distinct) subgroups of order 3 of $E'[3]$.
These curves come equipped with natural (forgetful) maps to $Y_E$.


Lemma 12. With notation as above, $Y_E'(\mathbf{Q})$ and $Y_E''(\mathbf{Q})$ are finite.


Proof. Let $X_E'$ and $X_E''$ denote the compactifications of $Y_E'$ and $Y_E''$,
respectively. The classical theory of modular curves shows that $X_E'(\mathbf{C}) =
\overline{\mathcal{H}/(\Gamma(5) \cap \Gamma_0(3))}$ and $X_E''(\mathbf{C}) = \overline{\mathcal{H}/(\Gamma(5) \cap \Gamma_{\mathrm{split}}(3))}$. Thus by Lemma 8,
$X_E'$ and $X_E''$ are both curves of genus greater than 1, and the lemma follows
from Faltings’ Theorem (the Mordell Conjecture). □


3. PROOF OF THE IRREDUCIBILITY THEOREM (THEOREM 1)


Proposition 13. Suppose $E$ is an elliptic curve over $\mathbf{Q}$ such that
  • $\bar\rho_{E,3}$ restricted to $G_{\mathbf{Q}(\sqrt{-3})}$ is not absolutely irreducible,
  • $\bar\rho_{E,5}$ is reducible.
Then
$$j(E) \in \{-25/2,\ -5^2 \cdot 241^3/2^3,\ -5 \cdot 29^3/2^5,\ 5 \cdot 211^3/2^{15},\ 11^3/2^3,\ -29^3 \cdot 41^3/2^{15}\}.$$

Proof. Suppose $C_5$ is a nontrivial subgroup of $E[5]$ stable under $G_{\mathbf{Q}}$.

Case I: $\bar\rho_{E,3}$ is reducible. In this case $E[3]$ has a proper $G_{\mathbf{Q}}$-stable subgroup
$C_3$, and then $(E, C_3 + C_5)$ represents a rational point of $X_0(15)$. By Lemma
9(iii), $j(E) \in \{-25/2,\ -5^2 \cdot 241^3/2^3,\ -5 \cdot 29^3/2^5,\ 5 \cdot 211^3/2^{15}\}$.

Case II: $\bar\rho_{E,3}$ is irreducible. By Proposition 6, $E[3]$ has two distinct subgroups
$C_3$, $C_3'$ of order 3 such that $(E, C_5, \{C_3, C_3'\})$ represents a rational point of
$X_{0,\mathrm{split}}(5,3)$. By Lemma 10(iii), $j(E) \in \{11^3/2^3,\ -29^3 \cdot 41^3/2^{15}\}$. □


Proof of Theorem 1. Theorem 1 is now immediate from Propositions 7 and
13 and Lemma 9(iv). □


4. PROOF OF THE MODULARITY THEOREM (THEOREM 2)


Fix an elliptic curve $E$ over $\mathbf{Q}$ which is semistable at 3 and such that
$\bar\rho_{E,5}$ is irreducible. We want to show that $\bar\rho_{E,5}$ is modular.
Let the notation be as in §2.3 with $p = 5$. Define a subset $B$ of $X_E(\mathbf{Q})$
by
$$B = (X_E(\mathbf{Q}) - Y_E(\mathbf{Q})) \cup \mathrm{image}(Y_E'(\mathbf{Q}) \to Y_E(\mathbf{Q})) \cup \mathrm{image}(Y_E''(\mathbf{Q}) \to Y_E(\mathbf{Q}))$$
and let $T = \psi^{-1}(B) \subset \mathbf{P}^1(\mathbf{Q})$. By Lemma 12, $T$ is a finite set.

Suppose $t \in \mathbf{Q} - T$, and let $E'$ be the elliptic curve $y^2 = x^3 + f_E(t)x +
g_E(t)$ from Proposition 11. Since $\psi(t)$ is not the image of a rational point
of $Y_E'$ or of $Y_E''$, Proposition 6 shows that $\bar\rho_{E',3}$ restricted to $G_{\mathbf{Q}(\sqrt{-3})}$ is
absolutely irreducible. Further, since $E$ is semistable at 3, Proposition 7.1
of [15] shows that $E'$ is semistable at 3 as well. Therefore by Theorems B
and C applied with $p = 3$, $E'$ is modular, and so $\bar\rho_{E',5}$ is modular. By
Proposition 11, $\bar\rho_{E',5} \cong \bar\rho_{E,5}$, so $\bar\rho_{E,5}$ is modular. □


5. MOD 5 REPRESENTATIONS AND ELLIPTIC CURVES


Fix a prime $p > 2$ and a representation $\rho : G_{\mathbf{Q}} \to \mathrm{GL}_2(\mathbf{F}_p)$ such that
$\det(\rho)$ is the cyclotomic character $\chi_p$. Equivalently, fix a 2-dimensional
$\mathbf{F}_p$-vector space $V(\rho)$ with a $G_{\mathbf{Q}}$-action such that $\wedge^2 V(\rho)$ is isomorphic to
$\mu_p$ as a $G_{\mathbf{Q}}$-module.

Let $V_p = \mathbf{Z}/p\mathbf{Z} \times \mu_p$. There is a canonical identification $\wedge^2 V_p = \mu_p$.
Define

$$\mathrm{SL}(V_p) = \{\alpha \in \mathrm{Aut}(V_p) : \det(\alpha) = 1 \text{ on } \wedge^2 V_p\}.$$

Fix an $\mathbf{F}_p$-vector space isomorphism $\varphi : V_p \to V(\rho)$.


Lemma 14. The map $\sigma \mapsto \varphi^{-1} \circ \varphi^\sigma$ is a one-cocycle on $G_{\mathbf{Q}}$ with values
in $\mathrm{SL}(V_p)$. The class $c \in H^1(\mathbf{Q}, \mathrm{SL}(V_p))$ of this cocycle depends only on
the isomorphism class of $\rho$, and not on the choice of $\varphi$.


Proof. The fact that $\varphi^{-1} \circ \varphi^\sigma \in \mathrm{SL}(V_p)$ follows from the assumption on
$\det(\rho)$. The other assertions of the lemma are all easy computations. (Note
that we are using non-commutative Galois cohomology [8].) □


Let $X_p$ be the modular curve defined in §2.3. Let $\mathcal{E}_p$ be the universal
elliptic curve over $X_p$ (see for example [12] or [14]), and $\iota : X_p \to \mathcal{E}_p$ the
zero section. Finally, define the line bundle $\mathcal{L}_p$ to be the pull-back along $\iota$
of the cotangent bundle of $\mathcal{E}_p$ over $X_p$.

Define $\mathrm{Aut}_0(\mathcal{E}_p)$ to be the isomorphisms from $\mathcal{E}_p$ to itself which map the
image of the zero section $\iota(X_p)$ to itself. There are natural $G_{\mathbf{Q}}$-equivariant
maps

$$\mathrm{SL}(V_p) \to \mathrm{Aut}_0(\mathcal{E}_p) \to \mathrm{Aut}(\mathcal{L}_p) \to \mathrm{Aut}(X_p) \tag{1}$$

and we apply these maps to the class $c$ of Lemma 14:

$$H^1(\mathbf{Q}, \mathrm{SL}(V_p)) \to H^1(\mathbf{Q}, \mathrm{Aut}_{\overline{\mathbf{Q}}}(\mathcal{L}_p)) \to H^1(\mathbf{Q}, \mathrm{Aut}_{\overline{\mathbf{Q}}}(X_p)), \qquad c \mapsto c_{\mathcal{L}} \mapsto c_X.$$

Define $\eta_\rho = \det(\varphi^{-1}) : \wedge^2 V(\rho) \to \mu_p$. Let $X(\rho)$ denote the modular
curve $X_{V(\rho)}$ of §2.3. Then $X(\rho)$ contains an open curve $Y(\rho)$ whose
points correspond to isomorphism classes of pairs $(E, \phi)$ where $E$ is an
elliptic curve and $\phi : E[p] \to V(\rho)$ is an isomorphism such that $\eta_\rho \circ \det(\phi) :
\wedge^2 E[p] \to \wedge^2 V(\rho) \to \mu_p$ is the Weil pairing. In particular (see for example
Proposition 2.1 of [7]) a point of $Y(\rho)(\mathbf{Q})$ gives an elliptic curve $E$ over $\mathbf{Q}$
such that $\bar\rho_{E,p} \cong \rho$.

As explained in [15] §2 and [7] Remark 2.4, $X(\rho)$ is the twist of $X_p$ by
$c_X$ in the sense of [8]. Now suppose $p = 3$ or 5, so $X_p \cong \mathbf{P}^1$ over $\mathbf{Q}$. To
prove Theorem 3, we need only show that $Y(\rho)(\mathbf{Q})$ is nonempty. We will
do this by showing that $c_X = 0$, so $X(\rho) \cong X_p \cong \mathbf{P}^1$ over $\mathbf{Q}$. For a related
discussion see [10].


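Lemma 15. Suppose $p = 3$ or 5, and fix an isomorphism $X_p \cong \mathbf{P}^1$ over $\mathbf{Q}$.
(i) $\mathrm{Aut}_{\overline{\mathbf{Q}}}(X_p) \cong \mathrm{PGL}_2(\overline{\mathbf{Q}}) = \mathrm{PSL}_2(\overline{\mathbf{Q}})$.
(ii) $\mathrm{Aut}_{\overline{\mathbf{Q}}}(\mathcal{L}_p) \cong \mathrm{GL}_2(\overline{\mathbf{Q}})/\mu_m$, where

$$m = \begin{cases} 1 & \text{if } p = 3 \\ 5 & \text{if } p = 5 \end{cases}$$

and $\mu_m$ lies inside $\mathrm{GL}_2(\overline{\mathbf{Q}})$ as scalars.


Proof. Assertion (i) is well-known.

By [4] §12.1, the degree of the line bundle $\mathcal{L}_p$ is $\#(\mathrm{PSL}_2(\mathbf{F}_p))/12 =
m$. Since $X_p \cong \mathbf{P}^1$, $\mathcal{L}_p$ is determined by its degree so we can identify
$H^0(X_p, \mathcal{L}_p)$, the space of global sections of $\mathcal{L}_p$, with the space of
homogeneous polynomials of degree $m$ in two variables. Further $\mathcal{L}_p$ is generated
by its global sections, so this identification gives

$$\mathrm{Aut}(\mathcal{L}_p) \cong \mathrm{Aut}(H^0(X_p, \mathcal{L}_p)) \cong \mathrm{GL}_2/\mu_m$$

where $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{GL}_2$ sends a polynomial $f(x,y)$ to $f(ax + by, cx + dy)$. □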
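
The integer $m = \#(\mathrm{PSL}_2(\mathbf{F}_p))/12$ can be recomputed by brute force. The Python sketch below is our own illustration, with the helper name `psl2_order` introduced here. It counts matrices of determinant 1 over $\mathbf{Z}/n\mathbf{Z}$ for $n = 3$ and $5$, and also for $n = 4$, the case taken up in the Remark following the proof of Theorem 3.

```python
from itertools import product

def psl2_order(n):
    """Order of SL_2(Z/nZ)/{+1, -1}, by enumeration; valid for n >= 3, where -1 != 1."""
    sl2 = sum(1 for a, b, c, d in product(range(n), repeat=4)
              if (a * d - b * c) % n == 1)
    return sl2 // 2

# m = #PSL_2(Z/nZ) / 12 for the three levels discussed in the text.
M = {n: psl2_order(n) // 12 for n in (3, 4, 5)}
```

The parity of $m$ is what matters: it is odd for $n = 3, 5$ and even for $n = 4$.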


Proof of Theorem 3. To simplify notation we give the proof in the case that
$F = \mathbf{Q}$. The case of an arbitrary field of characteristic different from $p$ is
identical.

Let $p = 3$ or 5 and $m = 1$ or 5, respectively, as in Lemma 15(ii). Let
$\rho : G_{\mathbf{Q}} \to \mathrm{GL}_2(\mathbf{F}_p)$ be a representation whose determinant is $\chi_p$ and let
$c_X$ and $c_{\mathcal{L}}$ be as above. By Lemma 15, there are exact sequences

$$0 \to \pm 1 \to \mathrm{SL}_2(\overline{\mathbf{Q}}) \to \mathrm{Aut}_{\overline{\mathbf{Q}}}(X_p) \to 0,$$
$$0 \to \mu_m \to \mathrm{GL}_2(\overline{\mathbf{Q}}) \to \mathrm{Aut}_{\overline{\mathbf{Q}}}(\mathcal{L}_p) \to 0.$$

Since $H^1(\mathbf{Q}, \mathrm{GL}_2(\overline{\mathbf{Q}})) = H^1(\mathbf{Q}, \mathrm{SL}_2(\overline{\mathbf{Q}})) = 0$ (see [8] III Lemme 1), the
induced long exact sequences from Galois cohomology show that $c_X \in
H^1(\mathbf{Q}, \mathrm{Aut}_{\overline{\mathbf{Q}}}(X_p))$ is killed by 2 and $c_{\mathcal{L}} \in H^1(\mathbf{Q}, \mathrm{Aut}_{\overline{\mathbf{Q}}}(\mathcal{L}_p))$ is killed by $m$.
But $m$ is odd, and $c_X$ is the image of $c_{\mathcal{L}}$ under (1), so $c_X = 0$. Therefore
$X(\rho)$ is isomorphic over $\mathbf{Q}$ to $\mathbf{P}^1$, so $Y(\rho)(\mathbf{Q})$ is infinite and there are
(infinitely many) elliptic curves $E$ over $\mathbf{Q}$ with $\bar\rho_{E,p}$ isomorphic to $\rho$. □


Remark. The above proof of Theorem 3 relies heavily on the fact that $X_p$
has genus zero, which is true for $p = 3$ and 5 but false for $p > 5$. However,
one can replace $p$ by 4 and define $X_4$ corresponding to $V_4 = \mathbf{Z}/4\mathbf{Z} \times \mu_4$ (see
[15]). Then $X_4$ has genus zero, and one can define $\mathcal{E}_4$ and $\mathcal{L}_4$ and apply the
method of the proof above. But the corresponding integer $m$ of Lemma
15(ii) is $\#(\mathrm{PSL}_2(\mathbf{Z}/4\mathbf{Z}))/12 = 2$, so we cannot conclude that $c_X = 0$.

In fact, it is not true that every representation $\rho : G_{\mathbf{Q}} \to \mathrm{GL}_2(\mathbf{Z}/4\mathbf{Z})$
with determinant equal to the mod 4 cyclotomic character is isomorphic to
the representation of $G_{\mathbf{Q}}$ on $E[4]$ for some elliptic curve $E$ over $\mathbf{Q}$. For
example, let p be the nontrivial representation of Gq into {1,(3%)} Cc 
$\mathrm{GL}_2(\mathbf{Z}/4\mathbf{Z})$ factoring through $\mathrm{Gal}(\mathbf{Q}(i)/\mathbf{Q})$. Then $\det(\rho)$ is the mod 4
cyclotomic character, but if $\rho$ came from an elliptic curve $E$ we would have
$E(\mathbf{R})[4] \cong \mathbf{Z}/2\mathbf{Z} \times \mathbf{Z}/2\mathbf{Z}$, which is impossible.


AN EXTENSION OF WILES’ RESULTS


FRED DIAMOND 


1. INTRODUCTION 


Suppose that $E$ is an elliptic curve defined over $\mathbf{Q}$. We wish to prove
that $E$ is modular, or equivalently, that the associated $\ell$-adic representation

$$\rho : G_{\mathbf{Q}} \to \mathrm{GL}_2(\mathbf{Z}_\ell)$$

is modular for some prime $\ell$.

If we are assuming that $E$ is semistable, i.e., has square-free conductor,
then we can impose some convenient hypotheses on the local behavior of
the Galois representations $\rho$ we consider. By “local behavior,” we mean
the behavior of the representation

$$\theta : G_p \to \mathrm{GL}_2(\mathbf{Z}_\ell)$$

defined by restricting $\rho$ to a decomposition group at $p$.

Recall that if $E$ has good reduction at $p$, then $\theta$ is unramified. If $E$
has multiplicative reduction at $p$, then a convenient description of $\theta$ results
from the Tate parametrization of $E$ (§17 of [S]). In particular, we see that

$$\theta|_{I_p} \sim \begin{pmatrix} 1 & * \\ 0 & 1 \end{pmatrix}.$$

To consider elliptic curves with additive reduction at some primes $p \ne \ell$,
we must allow more general types of $\theta$. We can actually consider
representations $\rho$ with arbitrary local behavior at primes $p \ne \ell$. This is carried out
in [D] where, building on the work of Wiles [W] and Taylor-Wiles [TW],
we prove a result of the form

(1)    $\bar\rho$ modular $\Rightarrow$ $\rho$ modular,

and deduce


Theorem 1.1. If $E$ has good or multiplicative reduction at 3 and 5, then
$E$ is modular.


The details of the proof can be found in [D]. Here we give an exposition
which we hope is more motivated and systematic. We often follow [DDT],
admitting results which are straightforward generalizations of those there
or elsewhere in this volume. For the proofs of some of the key lemmas, we
refer completely to [D].

This article is structured as follows:
Some background is given in §2 on local Galois representations

$$\sigma : G_p \to \mathrm{GL}_2(k),$$

where $p \ne \ell$ and $k$ is an algebraic closure of $\mathbf{F}_\ell$. A classification of the
possible $\sigma$, though not logically necessary for the proof of theorem 1.1, helps
provide some insight into local Galois representations and their deformations.
(The appendix with K. Kramer determines precisely how local Galois
representations arising from elliptic curves fit into this classification.)

In §3 we explain what it means for a deformation of $\sigma$ to be “minimally
ramified” at $p$.

Suppose that $\ell$ is odd and

$$\bar\rho : G_{\mathbf{Q}} \to \mathrm{GL}_2(k)$$

is an irreducible representation which is semistable at $\ell$. We formulate in
§4 a certain deformation problem for each finite set of primes $\Sigma$. This
deformation problem turns out to be representable by a ring $R_\Sigma$ whose
tangent space is described in terms of Galois cohomology (see [M]).

Suppose now that $\bar\rho$ is modular. The goal of §5 is to define a corresponding
Hecke algebra $\mathbf{T}_\Sigma$, and modular deformation

$$\rho_\Sigma^{\mathrm{mod}} : G_{\mathbf{Q}} \to \mathrm{GL}_2(\mathbf{T}_\Sigma)$$

arising from a homomorphism $\phi_\Sigma : R_\Sigma \to \mathbf{T}_\Sigma$ (see [Ri2]).

If we can show that $\phi_\Sigma$ is an isomorphism, then we obtain a result of the
form (1) as a corollary. The main results are stated in §6.

To prove that $\phi_\Sigma$ is an isomorphism, we must modify some of the
techniques used in [W] and [TW]. In particular, the analysis of the Hecke
algebras becomes more difficult. In our sketch of the proof in §7, we
indicate where the complications arise, but give only a rough idea of how they
are dealt with in [D].


2. LOCAL REPRESENTATIONS MOD $\ell$


Suppose that $p$ is a prime. We let $G_p = \mathrm{Gal}(\overline{\mathbf{Q}}_p/\mathbf{Q}_p)$ and let $I_p$ denote
the inertia subgroup of $G_p$. Suppose that $\ell$ is an odd prime different from
$p$ and consider continuous representations

$$\sigma : G_p \to \mathrm{GL}_2(k),$$

where $k$ is an algebraic closure of $\mathbf{F}_\ell$. We let $\bar\sigma$ denote the associated
projective representation.

We let $\chi$ denote the cyclotomic character $G_p \to k^\times$. Note that $\chi$ is
nontrivial if and only if $p \not\equiv 1 \bmod \ell$. In that case, we write $\mathrm{sp}_2$ for the
representation

$$\begin{pmatrix} \chi & u \\ 0 & 1 \end{pmatrix} \tag{2}$$

where $u$ is a cocycle representing the image of a uniformizer under the
Kummer map

$$\mathbf{Q}_p^\times \to \mathbf{Q}_p^\times/(\mathbf{Q}_p^\times)^\ell \to H^1(G_p, k(1)).$$

The equivalence class is independent of the choice of uniformizer and
cocycle. (See the proof of proposition 2.2 below.)

If $\psi$ is a character $G_p \to k^\times$, then we write $k(\psi)$ for the one-dimensional
vector space over $k$ on which $G_p$ acts via $\psi$. Recall that two representations
$\sigma_1$ and $\sigma_2$ are called twist-equivalent if $\sigma_1$ is equivalent to $\psi \otimes \sigma_2$ for some
character $\psi : G_p \to k^\times$.

We classify $\sigma$ according to the following four types of behavior (principal,
special, vexing or harmless).

P : $\sigma$ is reducible and $\sigma|_{I_p}$ is decomposable.
S : $\sigma$ is reducible and $\sigma|_{I_p}$ is indecomposable.
V : $\sigma$ is irreducible and $\sigma|_{I_p}$ is reducible.
H : $\sigma$ is irreducible and $\sigma|_{I_p}$ is irreducible.

Proposition 2.1. The following are equivalent:

1. $\sigma$ is reducible and $\sigma|_{I_p}$ is decomposable.
2. $\sigma$ is twist-equivalent to a representation either of the form
   (a) $\begin{pmatrix} \psi & 0 \\ 0 & 1 \end{pmatrix}$ for some character $\psi$, or
   (b) $\begin{pmatrix} 1 & \varphi \\ 0 & 1 \end{pmatrix}$ for some additive unramified character $\varphi$.
3. Either $\bar\sigma(G_p)$ is cyclic of order not divisible by $\ell$, or it has order $\ell$
   and $\bar\sigma(I_p)$ is trivial.
4. $\bar\sigma(G_p)$ is cyclic and the order of $\bar\sigma(I_p)$ is not divisible by $\ell$.


Proof: Suppose 1 holds. Then $\sigma$ is twist-equivalent to a representation
of the form

$$\begin{pmatrix} \psi & u \\ 0 & 1 \end{pmatrix}$$

for some character $\psi$, where $u$ is a cocycle representing a class
$x \in H^1(G_p, k(\psi))$.
If $\sigma$ is indecomposable, then $x$ is nontrivial. On the other hand, the image
of $x$ in $H^1(I_p, k(\psi))$ vanishes, so $x$ is in the image of $H^1(G_p/I_p, k(\psi)^{I_p})$.
This last group is trivial unless $\psi$ is trivial, so 2 follows.
The implications 2 $\Rightarrow$ 3 $\Rightarrow$ 4 $\Rightarrow$ 1 are clear.


Proposition 2.2. The following are equivalent:

1. $\sigma$ is reducible and $\sigma|_{I_p}$ is indecomposable.
2. $p \equiv 1 \bmod \ell$ and $\sigma$ is twist-equivalent to a representation of the form

   $$\begin{pmatrix} 1 & \varphi \\ 0 & 1 \end{pmatrix}$$

   for some additive ramified character $\varphi$, or $p \not\equiv 1 \bmod \ell$ and $\sigma$ is
   twist-equivalent to $\mathrm{sp}_2$.
3. $p \equiv 1 \bmod \ell$, $\bar\sigma(I_p)$ has order $\ell$ and $\bar\sigma(G_p)$ has order dividing $\ell^2$, or
   $p \not\equiv 1 \bmod \ell$, $\bar\sigma(I_p)$ has order $\ell$ and $\bar\sigma(G_p)$ has order $d\ell$ where $d$ is the
   order of $p$ in $\mathbf{F}_\ell^\times$.
4. $\bar\sigma(I_p)$ is cyclic of order divisible by $\ell$.


Proof: Suppose 1 holds. Then $\sigma$ is twist-equivalent to a representation
of the form

$$\begin{pmatrix} \psi & u \\ 0 & 1 \end{pmatrix}$$

for some character $\psi$ where $u$ is a cocycle representing a class
$x \in H^1(G_p, k(\psi))$.

Since $\sigma|_{I_p}$ is indecomposable, the image of $x$ in $H^1(I_p, k(\psi))^{G_p}$ does not
vanish. This group is isomorphic to $\mathrm{Hom}_{G_p}(\mu_\ell(\overline{\mathbf{Q}}_p), k(\psi))$, which vanishes
unless $\psi = \chi$. Moreover if $\chi$ is non-trivial, then $H^1(G_p, k(1))$ is one
dimensional over $k$, so 2 follows.

The implications 2 $\Rightarrow$ 3 $\Rightarrow$ 4 are clear. If 4 holds, then 1 follows from
the fact that $\sigma(G_p)$ is contained in the normalizer of the $\ell$-Sylow subgroup
of $\sigma(I_p)$.
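
The integer $d$ in condition 3 is the multiplicative order of $p$ modulo $\ell$, which is easy to compute. The Python sketch below is our own illustration; the function names `order_mod` and `special_projective_orders` are hypothetical and introduced only here. It returns the pair of orders $(\ell, d\ell)$ predicted by condition 3 in the case $p \not\equiv 1 \bmod \ell$.

```python
def order_mod(p, ell):
    """Multiplicative order of p in (Z/ell Z)^*, for a prime ell not dividing p."""
    if p % ell == 0:
        raise ValueError("p must be prime to ell")
    d, x = 1, p % ell
    while x != 1:
        x = (x * p) % ell
        d += 1
    return d

def special_projective_orders(p, ell):
    """Orders (|image of I_p|, |image of G_p|) from condition 3 when p is not 1 mod ell."""
    d = order_mod(p, ell)
    if d == 1:
        raise ValueError("p is 1 mod ell: use the other case of condition 3")
    return ell, d * ell
```

For example, with $p = 2$ and $\ell = 3$ the order of 2 modulo 3 is 2, so the projective image of $G_p$ has order 6.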


Proposition 2.3. The following are equivalent:

1. $\sigma$ is irreducible and $\sigma|_{I_p}$ is reducible.
2. $\sigma$ is equivalent to a representation of the form $\mathrm{Ind}_{G_M}^{G_p} \xi$, where $M$ is
   the unramified quadratic extension of $\mathbf{Q}_p$ and $\xi$ is a character of $G_M$
   not equal to its conjugate under the action of $\mathrm{Gal}(M/\mathbf{Q}_p)$.
3. $\bar\sigma(I_p)$ is cyclic of order not divisible by $\ell$, and $\bar\sigma(G_p)$ is dihedral of
   twice that order.
4. $\bar\sigma(I_p)$ is cyclic of order not divisible by $\ell$, and $\bar\sigma(G_p)$ is not cyclic.


Proof: Suppose that 1 holds. Consider the action of $G_p$ on $\mathbf{P}^1(k)$ gotten
from $\sigma$. Note first that $\sigma(I_p)$ is nontrivial. Let $S$ denote the set of elements
in $\mathbf{P}^1(k)$ fixed by $I_p$. Since $\sigma|_{I_p}$ is reducible, $S$ is not empty. Since $\sigma$
is irreducible, $S$ has no elements fixed by $G_p$ and it follows that $S$ has
exactly two elements. Moreover $G_p$ acts transitively on $S$ via the unramified
quadratic character, so 2 holds.

Suppose next that 2 holds. Then $\bar\sigma(G_p)$ is a dihedral group in which
$\bar\sigma(G_M)$ is a cyclic subgroup of index two and order not divisible by $\ell$. Since
$M^\times = \mathbf{Q}_p^\times \mathcal{O}_M^\times$, we see from local class field theory that

$$\xi^{-1}\xi'(G_M) = \xi^{-1}\xi'(I_p),$$

where $\xi'$ is the conjugate of $\xi$, and 3 follows. The implication 3 $\Rightarrow$ 4 is
clear, and 4 $\Rightarrow$ 1 follows from the converse of the corresponding one in
Proposition 2.1.


Proposition 2.4. The following are equivalent:

1. $\sigma|_{I_p}$ is irreducible.
2. $p$ is odd and $\sigma$ is equivalent to a representation of the form $\mathrm{Ind}_{G_M}^{G_p} \xi$,
   where $M$ is a ramified quadratic extension of $\mathbf{Q}_p$ and $\xi$ is a character
   of $G_M$ whose restriction to $I_M$ is not equal to its conjugate under the
   action of $\mathrm{Gal}(M/\mathbf{Q}_p)$, or $p = 2$ and the restriction of $\sigma$ to the wild
   inertia subgroup of $G_p$ is irreducible.
3. $\bar\sigma(I_p)$ is dihedral of order $2p^r$ for some $r \ge 1$ and $\bar\sigma(G_p)$ is dihedral of
   order dividing $4p^r$, or $p = 2$ and $\bar\sigma(I_p)$ (respectively $\bar\sigma(G_p)$) is isomorphic
   to the dihedral group of order 4 (respectively $A_4$), $A_4$ (respectively $A_4$)
   or $A_4$ (respectively $S_4$).
4. $\bar\sigma(I_p)$ is not cyclic.


Proof: Suppose that 1 holds and furthermore that $\sigma|_{P_p}$ is reducible,
where $P_p$ is the wild inertia subgroup of $I_p$. Consider the action of $G_p$
on $\mathbf{P}^1(k)$ gotten from $\sigma$. Since $\bar\sigma(I_p)$ is not cyclic, we see that $\bar\sigma(P_p)$ is
nontrivial. Let $S$ denote the set of elements in $\mathbf{P}^1(k)$ fixed by $P_p$. Then $S$
is not empty and has no elements fixed by $I_p$. It follows that $S$ has exactly
two elements and that $I_p$ acts transitively. Therefore $p$ is odd and $G_p$ acts
transitively on $S$ via a ramified quadratic character. We deduce that 2
holds, where $M$ is the corresponding quadratic extension of $\mathbf{Q}_p$. (We have
that $\xi \ne \xi'$ on $P_p$, hence on $I_M$.)

Suppose now that 2 holds. First consider the case of odd $p$. Then $\bar\sigma(G_p)$
(respectively, $\bar\sigma(I_p)$) is dihedral, and $\bar\sigma(G_M)$ (respectively, $\bar\sigma(I_M)$) is a cyclic
subgroup of index two. Letting $U$ denote the kernel of the reduction map
on $\mathcal{O}_M^\times$, we have that $\mathbf{Q}_p^\times U = \mathbf{Q}_p^\times \mathcal{O}_M^\times$ has index two in $M^\times$. From local class
field theory it follows that $\bar\sigma(I_M) = \bar\sigma(P_M)$ has $p$-power order and index at
most two in $\bar\sigma(G_M)$. We conclude that 3 holds.

In the case of $p = 2$, we see that $D = \bar\sigma(P_p)$ is dihedral, since it is not
cyclic and is a finite subgroup of $\mathrm{PGL}_2(k)$ of 2-power order. Furthermore
$\bar\sigma(G_p)$ is contained in the normalizer of $D$. If $D$ has order greater than 4,
the normalizer is dihedral and we may use the same argument as in the
case of odd $p$. If $D$ has order 4, then the normalizer is isomorphic to $S_4$,
and 3 follows.

The implication 3 $\Rightarrow$ 4 is clear, as is 4 $\Rightarrow$ 1 (in view of the preceding
propositions).
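
Condition 4 in each of Propositions 2.1 to 2.4 gives a decision procedure for the type of $\sigma$ in terms of the projective images of $I_p$ and $G_p$ alone. The Python sketch below is our own summary of those four conditions; the function `local_type` and its parameter names are hypothetical, introduced only for this illustration.

```python
def local_type(inertia_cyclic, inertia_order, full_cyclic, ell):
    """Return 'P', 'S', 'V' or 'H' from the projective images of I_p and G_p.

    inertia_cyclic: whether the projective image of I_p is cyclic
    inertia_order:  the order of the projective image of I_p
    full_cyclic:    whether the projective image of G_p is cyclic
    ell:            the residue characteristic of k
    """
    if not inertia_cyclic:
        return "H"      # Proposition 2.4, condition 4
    if inertia_order % ell == 0:
        return "S"      # Proposition 2.2, condition 4
    if full_cyclic:
        return "P"      # Proposition 2.1, condition 4
    return "V"          # Proposition 2.3, condition 4
```

The four branches are mutually exclusive and exhaustive, which reflects the fact that every $\sigma$ has exactly one of the types P, S, V, H.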
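3. MINIMALLY RAMIFIED LIFTINGS


For a fixed representation
$$\sigma : G_p \to \mathrm{GL}_2(k),$$
we consider liftings
$$\theta : G_p \to \mathrm{GL}_2(R)$$
of $\sigma$, where $R$ is a complete local Noetherian $W(k)$-algebra with residue
field $k$. We shall now say what it means for $\theta$ to be minimally ramified.
We use $\tilde{\ }$ to denote composition with the Teichmüller lift
$$k^\times \to W(k)^\times \to R^\times.$$

Definition 3.1.
1. If $\sigma$ is of type P or V, then

   $$\sigma|_{I_p} \sim \begin{pmatrix} \xi_1 & 0 \\ 0 & \xi_2 \end{pmatrix}$$

   and we say $\theta$ is minimally ramified if

   $$\theta|_{I_p} \sim \begin{pmatrix} \tilde\xi_1 & 0 \\ 0 & \tilde\xi_2 \end{pmatrix}.$$

2. If $\sigma$ is of type S, then

   $$\sigma|_{I_p} \sim \xi \otimes \begin{pmatrix} 1 & * \\ 0 & 1 \end{pmatrix},$$

   and we say $\theta$ is minimally ramified if

   $$\theta|_{I_p} \sim \tilde\xi \otimes \begin{pmatrix} 1 & * \\ 0 & 1 \end{pmatrix}.$$

3. If $\sigma$ is of type H, then we say $\theta$ is minimally ramified if $\det \theta|_{I_p}$ is
   the Teichmüller lift of $\det \sigma|_{I_p}$.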

Remark 3.2. First note that if $\chi$ is a character $G_p \to k^\times$, then $\theta$ is a
minimally ramified lifting of $\sigma$ if and only if $\tilde\chi \otimes \theta$ is a minimally ramified
lifting of $\chi \otimes \sigma$.


Remark 3.3. If $\sigma$ is of type P, then it has a twist which is either unramified
or of type B in the terminology of [W]. Note that if $\sigma$ is unramified,
then $\theta$ is minimally ramified if and only if $\theta$ is unramified.


Remark 3.4. If $\sigma$ is of type S, then it has a twist of type A in the
terminology of [W]. Recall that if $\theta$ arises from the $\ell$-adic Tate module of
an elliptic curve $E$ over $\mathbf{Q}_p$ with split multiplicative reduction, then $\theta|_{I_p}$
is equivalent to a representation of the form $\begin{pmatrix} 1 & * \\ 0 & 1 \end{pmatrix}$. It is minimally
ramified if and only if $\sigma$ is ramified if and only if $v_p(\Delta_E)$ is not divisible by $\ell$.


Remark 3.5. Suppose now that $\sigma$ is type V. If $p \not\equiv -1 \bmod \ell$, then $\sigma$ is
type C in the terminology of [W]. Suppose instead that $p \equiv -1 \bmod \ell$ and
write $\sigma = \mathrm{Ind}_{G_M}^{G_p} \xi$ as in proposition 2.3. Let $\mu : G_M \to \mathcal{O}^\times$ be a ramified
character of $G_M$ of $\ell$-power order. Then

$$\theta = \mathrm{Ind}_{G_M}^{G_p} \tilde\xi\mu \tag{3}$$

is a lifting of $\sigma$ which is not minimally ramified.

Remark 3.6. Now consider $\sigma$ of type H. Suppose that
$$\sigma|_{I_p} \sim \mathrm{Ind}_{I_M}^{I_p} \xi$$
as in proposition 2.4. Then $\theta$ is minimally ramified if and only if
$$\theta|_{I_p} \sim \mathrm{Ind}_{I_M}^{I_p} \tilde\xi.$$


Remark 3.7. Suppose that $\theta : G_p \to \mathrm{GL}_2(\mathcal{O})$ is a minimally ramified lifting
of $\sigma$, where $\mathcal{O}$ is the ring of integers of a finite extension of the field of
fractions of $W(k)$. Then $\det \theta|_{I_p}$ is the Teichmüller lift of $\det \sigma|_{I_p}$ and the
Artin conductors of $\theta$ and $\sigma$ coincide. In [W] and [TW] a technical hypothesis
is imposed to ensure that a partial converse holds. This hypothesis
rules out the existence of liftings as in (3) and facilitates the characterization
of the modular forms which give rise to minimally ramified liftings.
The main contribution of [D] is to dispense with that hypothesis.


4. UNIVERSAL DEFORMATION RINGS


Now consider an irreducible representation

$$\bar\rho : G_{\mathbf{Q}} \to \mathrm{GL}_2(k).$$

For each prime $p$ we fix an embedding of $\overline{\mathbf{Q}}$ in $\overline{\mathbf{Q}}_p$ and regard $G_p$ as a
decomposition group in $G_{\mathbf{Q}}$. We suppose that $\bar\rho|_{G_\ell}$ is semistable in the
sense of [DDT], section 2.4.

Suppose that $K$ is a finite extension of the field of fractions of $W(k)$. Let
$\mathcal{O}$ denote the integral closure of $W(k)$ in $K$; thus $\mathcal{O}$ is a complete discrete
valuation ring with residue field $k$. We consider liftings of $\bar\rho$ of the form

$$\rho : G_{\mathbf{Q}} \to \mathrm{GL}_2(R),$$

where $R$ is in the category $\mathcal{C}$ of local complete $\mathcal{O}$-algebras with residue
field $k$. A deformation of $\bar\rho$ is an isomorphism class of such liftings (see
[dSL] (2.1), (2.2)).

If $\Sigma$ is a finite set of primes, we say that $\rho$ is type $\Sigma$ if

1. $\chi_\ell^{-1} \det \rho$ has finite order not divisible by $\ell$;
2. $\rho$ is minimally ramified outside $\Sigma$;
3. $\rho$ is semistable at $\ell$ in the sense of [DDT].

The notion depends only on the isomorphism class of $\rho$ and is independent
of the choice of embeddings of $\overline{\mathbf{Q}}$ in $\overline{\mathbf{Q}}_p$.

Consider the functor which associates to $R$ the set of deformations of
$\bar\rho$ of type $\Sigma$. The type $\Sigma$ restriction satisfies the conditions listed at the
beginning of §6 of [dSL] (see also §29 of [M] and §2.4 of [DDT]). From [dSL]
(2.4) and (6.1) we conclude that the functor is represented by a complete
local $\mathcal{O}$-algebra $R_\Sigma$, the identity map of $R_\Sigma$ corresponding to the universal
deformation of type $\Sigma$:

$$\rho_\Sigma^{\mathrm{univ}} : G_{\mathbf{Q}} \to \mathrm{GL}_2(R_\Sigma).$$

Suppose now that we are given a lifting
$$\rho : G_{\mathbf{Q}} \to \mathrm{GL}_2(\mathcal{O})$$
of type $\Sigma$. The universal property of $R_\Sigma$ yields a surjective morphism
$$\pi : R_\Sigma \to \mathcal{O}$$
such that $\rho$ is equivalent to the pushforward of $\rho_\Sigma^{\mathrm{univ}}$. Let $\mathfrak{p}$ denote the
kernel of $\pi$. We define the group
$$H^1_\Sigma(G_{\mathbf{Q}}, (\mathrm{ad}^0\rho) \otimes_{\mathcal{O}} (K/\mathcal{O}))$$
as in §2.7 of [DDT]. A generalization of results of Mazur (see §23-25 of
[M]) yields a canonical isomorphism

$$\mathrm{Hom}_{\mathcal{O}}(\mathfrak{p}/\mathfrak{p}^2, K/\mathcal{O}) \cong H^1_\Sigma(G_{\mathbf{Q}}, (\mathrm{ad}^0\rho) \otimes_{\mathcal{O}} (K/\mathcal{O})). \tag{4}$$


5. HECKE ALGEBRAS


Recall that given a newform

$$f(\tau) = \sum a_n(f) e^{2\pi i n \tau}$$

of weight 2, level $N_f$ and character $\psi_f$, a construction of Eichler and
Shimura (see [Ro]) associates to $f$ a continuous representation

$$\rho_f : G_{\mathbf{Q}} \to \mathrm{GL}_2(\overline{\mathbf{Q}}_\ell),$$

where we have fixed embeddings $\overline{\mathbf{Q}} \to \mathbf{C}$ and $\overline{\mathbf{Q}} \to \overline{\mathbf{Q}}_\ell$. The representation
$\rho_f$ is characterized up to isomorphism by the following property: For all
primes $p$ not dividing $N_f\ell$, $\rho_f$ is unramified at $p$ and the characteristic
polynomial of $\rho_f(\mathrm{Frob}_p)$ is

$$X^2 - a_p(f)X + \psi_f(p)p.$$
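
To make the characteristic polynomial concrete, here is a small Python sketch of ours. It assumes the standard fact that the newform attached to an elliptic curve over $\mathbf{Q}$ has trivial character and satisfies $a_p = p + 1 - \#E(\mathbf{F}_p)$ at primes of good reduction. The curve $y^2 + y = x^3 - x^2$ of conductor 11 is our own choice of example and does not appear in the text; the function names are hypothetical.

```python
def a_p(p):
    """a_p = p + 1 - #E(F_p) for E: y^2 + y = x^3 - x^2 (good reduction for p != 11)."""
    affine = sum(1 for x in range(p) for y in range(p)
                 if (y * y + y - x ** 3 + x * x) % p == 0)
    return p + 1 - (affine + 1)   # the extra 1 counts the point at infinity

def frobenius_char_poly(p):
    """Coefficients (1, -a_p, p) of X^2 - a_p X + p, the case of trivial character."""
    return (1, -a_p(p), p)
```

For instance `frobenius_char_poly(5)` encodes the polynomial $X^2 - X + 5$.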


We wish to continue working over the ring $\mathcal{O}$ introduced above, so we
also fix an embedding $\overline{\mathbf{Q}}_\ell \to \overline{K}$ and view $\rho_f$ as taking values in $\mathrm{GL}_2(K_f)$,
where $K_f$ is the subfield of $\overline{K}$ generated by $K$ and the Fourier coefficients
of $f$. We denote its ring of integers by $\mathcal{O}_f$, which we regard as an object of
$\mathcal{C}$. Define

$$\bar\rho_f : G_{\mathbf{Q}} \to \mathrm{GL}_2(k)$$

as the semisimplification of the reduction of $\rho_f$.

We assume that our fixed representation $\bar\rho$ is modular in the sense that
it is isomorphic to $\bar\rho_f$ for some weight 2 newform $f$. We let $\Phi_\Sigma$ denote the
set of newforms $g$ such that $\rho_g$ is a deformation of $\bar\rho$ of type $\Sigma$ and $N_g$ is
not divisible by $\ell^2$.


Theorem 5.1. If $\bar\rho$ is modular, then $\Phi_\Sigma \ne \emptyset$.


This is a refinement of Serre’s $\varepsilon$-conjecture for which a crucial ingredient
is Ribet’s theorem [Ri1] (see [E]). The result stated here is a consequence
of [D] which builds on the work of Ribet and many others.

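For each $g$ in $\Phi_\Sigma$, we consider the map $R_\Sigma \to \mathcal{O}_g$ corresponding to $\rho_g$.
We then define

$$\mathbf{T}_\Sigma \subset \prod_{g \in \Phi_\Sigma} \mathcal{O}_g$$

as the image of $R_\Sigma$. Since $R_\Sigma$ is topologically generated by traces, we may
also regard $\mathbf{T}_\Sigma$ as the $\mathcal{O}$-subalgebra generated by the elements

$$T_p = (a_p(g))_{g \in \Phi_\Sigma}$$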
for primes p not dividing N@. We wish to prove that the surjective map 
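$$\phi_\Sigma : R_\Sigma \to \mathbf{T}_\Sigma$$
is an isomorphism. Note that $\Phi_\Sigma$ gives rise to a type $\Sigma$ deformation
$$\rho_\Sigma^{\mathrm{mod}} : G_{\mathbf{Q}} \to \mathrm{GL}_2(\mathbf{T}_\Sigma)$$

of $\bar\rho$, such that for each $g \in \Phi_\Sigma$, the composition with the projection to
$\mathrm{GL}_2(\mathcal{O}_g)$ is equivalent to $\rho_g$.

For finite sets of primes $\Sigma \supset \emptyset$, there is a natural surjective homomorphism
$R_\Sigma \to R_\emptyset$ defined by regarding $\rho_\emptyset^{\mathrm{univ}}$ as a deformation of $\bar\rho$ of type
$\Sigma$. We have also the natural surjection $\mathbf{T}_\Sigma \to \mathbf{T}_\emptyset$ so that the diagram

$$\begin{array}{ccc} R_\Sigma & \stackrel{\phi_\Sigma}{\longrightarrow} & \mathbf{T}_\Sigma \\ \downarrow & & \downarrow \\ R_\emptyset & \stackrel{\phi_\emptyset}{\longrightarrow} & \mathbf{T}_\emptyset \end{array} \tag{5}$$

commutes.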

6. THE MAIN RESULTS 


Recall our assumption that $\ell$ is odd and $\bar\rho$ is semistable at $\ell$. We let
$L = \mathbf{Q}(\sqrt{\varepsilon\ell})$, where $\varepsilon = (-1)^{(\ell-1)/2}$. We suppose that $\Sigma$ is an arbitrary
finite set of primes. The main result is the following:


Theorem 6.1. If $\bar\rho|_{G_L}$ is irreducible, then $\phi_\Sigma$ is an isomorphism and $T_\Sigma$
is a complete intersection.


We shall sketch the proof below, referring to [D] for the full details.


Corollary 6.2. Suppose that $\rho : G_{\mathbf{Q}} \to \mathrm{GL}_2(\mathcal{O})$ is continuous and unramified
outside a finite set of primes. Suppose that $\rho$ is semistable at $\ell$ and
$\bar\rho|_{G_L}$ is irreducible. If $\bar\rho$ is modular, then $\rho$ is modular.


Applying the Langlands-Tunnell theorem (see [G]) as in [Ru], we con- 
clude: 


Corollary 6.3. Suppose that $E$ is an elliptic curve over $\mathbf{Q}$ with good or
multiplicative reduction at 3, and that $[\mathbf{Q}(E[3]) : \mathbf{Q}] = 16$ or $48$. Then $E$
is modular.


We refer to [Ru] for the deduction of theorem 1.1 from corollaries 6.2 
and 6.3. 


7. SKETCH OF PROOF 


7.1. Vague principle. A formulation of the problem such as theorem 6.1
enables us to use tools from commutative algebra. We shall use information
about the vertical maps in (5) and one of the horizontal maps to
prove that the other horizontal map is an isomorphism. The information
about $R_\Sigma \to R_\emptyset$ comes from the description of tangent spaces in terms of
Galois cohomology (4); the information about $T_\Sigma \to T_\emptyset$ comes from the
connection with congruences between modular forms.


7.2. Some preparation. We begin with two reduction steps and a defi- 
nition. 

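One can check that if $\chi$ is a character $G_{\mathbf{Q}} \to k^\times$ unramified outside $\ell$,
then theorem 6.1 holds for $\bar\rho$ if and only if it holds for $\bar\rho' = \bar\rho \otimes \chi$. Indeed
if we define $\phi'_\Sigma : R'_\Sigma \to T'_\Sigma$ using $\bar\rho'$ instead of $\bar\rho$, then we obtain a natural
commutative diagram


$$\begin{array}{ccc}
R_\Sigma & \longrightarrow & R'_\Sigma \\
\downarrow{\scriptstyle \phi_\Sigma} & & \downarrow{\scriptstyle \phi'_\Sigma} \\
T_\Sigma & \longrightarrow & T'_\Sigma.
\end{array}$$


We can therefore assume that for each prime $p \ne \ell$ such that $\bar\rho|_{G_p}$ is
reducible (i.e., of type P or S), we have $\bar\rho^{I_p} \ne 0$.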

We also find that theorem 6.1 is well-behaved under extension of scalars.
More precisely, suppose that $K'$ is a finite extension of $K$. Defining


$$\phi'_\Sigma : R'_\Sigma \to T'_\Sigma$$


using $K'$ instead of $K$, we find that there is a natural commutative diagram
relating $\phi_\Sigma$ and $\phi'_\Sigma$, where $\mathcal{O}'$ is the ring of integers of $K'$. One deduces from
this that if theorem 6.1 holds for some $K$, then it holds for all $K$. In particular,
we may assume that there is an $\mathcal{O}$-algebra homomorphism $T_\emptyset \to \mathcal{O}$.

In view of remark 3.7, we must exercise extra care with primes $p$ such
that $\bar\rho|_{G_p}$ is of type V. We denote by $P$ the set of such vexing primes. (In
[W] and [TW], it is assumed that $P$ consists only of primes which are not
congruent to $-1$ mod $\ell$.)


7.3. The case $\Sigma = \emptyset$. Recall that the strategy of Wiles and Taylor-Wiles
in the “minimal case” is to choose, for each $n \ge 1$, a certain set $Q = Q_n$
consisting of primes congruent to 1 mod $\ell^n$. These sets $Q$ are chosen so that
$R_Q$ and $R_\emptyset$ can be topologically generated as $\mathcal{O}$-algebras by $r$ elements,
where $r$ is the cardinality of $Q$. Moreover the choice is made so that $T_Q$
and $T_\emptyset$ can be related using their natural structure as algebras over a group
ring where the group is generated by $r$ elements. One then proves $\phi_\emptyset$ is an
isomorphism using the arguments of §3 of [TW] and Chapter 3 of [W], or
using the Taylor-Wiles-Faltings criterion ([TW], Appendix or [DDT], §3.4).
Alternatively, using Rubin’s simplification of the isomorphism criterion (see
[dSRS]), it suffices to choose a single set $Q = Q_n$ as in [TW], where $n$ is
made explicit.

Our strategy is the same, but the set P introduces several complications. 
A minor complication is that we use a version over O of the isomorphism 
criterion (see §5 of [D]). We shall now state such a version along the lines 
of Rubin’s simplification, leaving it as an exercise to make the necessary 
modifications to the proof of Criterion II of [dSRS].

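We fix an integer $r \ge 0$ and consider power series rings


$$\mathcal{O}[[S]] = \mathcal{O}[[S_1, \ldots, S_r]] \quad\text{and}\quad \mathcal{O}[[X]] = \mathcal{O}[[X_1, \ldots, X_r]].$$
Let $\mathfrak{m}$ denote the maximal ideal of $\mathcal{O}[[S]]$. Recall that the polynomial


$$f(t) = \prod_{i=0}^{r} (t + i)$$


satisfies $f(n)/(r+1)! = \mathrm{length}_{\mathcal{O}}(\mathcal{O}[[S]]/\mathfrak{m}^n)$ for all integers $n \ge 1$. We
also fix $\mathcal{O}$-algebra homomorphisms


$$\mathcal{O}[[S]] \to \mathcal{O}[[X]] \to R \to T \tag{6}$$


with $\mathcal{O}[[X]] \to R$ and $R \to T$ surjective. Suppose that $T/(S_1, \ldots, S_r)T$ is
finitely generated as an $\mathcal{O}$-module; let $s$ denote its rank and $t$ the $\mathcal{O}$-length
of its torsion.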


Theorem 7.1. Suppose that there are positive integers $d$ and $N$ such that
1. $d > st + s + t$,
2. $f(N) + f(dN - d) - f(dN) > 0$,
3. $\mathcal{O}[[S]]/\mathfrak{m}^N \to T/\mathfrak{m}^N T$ is injective.
Then
• $R/(S_1, \ldots, S_r)R \to T/(S_1, \ldots, S_r)T$ is an isomorphism,
• $T/(S_1, \ldots, S_r)T$ is a local complete intersection,
• $s > 0$ and $t = 0$.


We shall apply the criterion with $R = R_Q$ and $T = T_Q$ for a certain set
$Q$ as in [TW]. We shall explain below how $r$, $d$, $N$ and $Q$ are to be chosen,
and the maps in (6) are to be defined.
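The counting identity for $f$ and hypothesis 2 of theorem 7.1 are easy to explore numerically. The sketch below assumes that $\mathcal{O}$ is a discrete valuation ring, so that $\mathfrak{m}$ is generated by a uniformizer together with $S_1, \ldots, S_r$ and the length of $\mathcal{O}[[S]]/\mathfrak{m}^n$ is the number of monomials of degree less than $n$ in these $r + 1$ generators. The function names are hypothetical.

```python
from itertools import product
from math import factorial, prod


def f(t, r):
    """f(t) = prod_{i=0}^{r} (t + i)."""
    return prod(t + i for i in range(r + 1))


def count_monomials(n, r):
    """Monomials of total degree < n in r + 1 variables (brute force)."""
    return sum(1 for e in product(range(n), repeat=r + 1) if sum(e) < n)


def smallest_N(r, d):
    """Smallest positive N with f(N) + f(dN - d) - f(dN) > 0.

    The loop terminates because the expression is a polynomial in N
    with leading term N^(r+1).
    """
    N = 1
    while f(N, r) + f(d * N - d, r) - f(d * N, r) <= 0:
        N += 1
    return N


# f(n) / (r + 1)! agrees with the monomial count for small n and r.
for r in range(0, 3):
    for n in range(1, 6):
        assert f(n, r) == factorial(r + 1) * count_monomials(n, r)

print(smallest_N(1, 2))
```

For $r = 1$ the expression in hypothesis 2 is $N^2 + (1 - 2d^2)N + d^2 - d$, so the required $N$ grows roughly like $2d^2$.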

For arbitrary $\Sigma$, let $I_\Sigma$ denote the kernel of the map $R_\Sigma \to R_\emptyset$. One can
check that the kernel of the natural surjection


$$T_\Sigma/I_\Sigma T_\Sigma \to T_\emptyset$$


is torsion. In particular, the rank of $T_\Sigma/I_\Sigma T_\Sigma$ is independent of $\Sigma$, and
we denote it $s'$. We denote by $t'$ the $\mathcal{O}$-length of the torsion submodule
of $T_P/I_P T_P$. We set $d = s't' + s' + t'$, $r = \dim_k H^1_\emptyset(G_{\mathbf{Q}}, \mathrm{ad}^0\bar\rho(1))$ and
choose $N$ so that the inequality of theorem 7.1(2) is satisfied. (Note that
$f(N) + f(dN - d) - f(dN)$ is a polynomial in $N$ with leading term $N^{r+1}$.)
By the same Galois cohomology argument as in §4 of [TW] (or see [dSh]
or [DDT]), we choose a finite set of primes $Q$ such that
• $\#Q = r$,
• $R_Q$ can be topologically generated as an $\mathcal{O}$-algebra by $r$ elements,
• if $q \in Q$, then the following hold:
  - $q \equiv 1 \bmod \ell^N$;
  - $\bar\rho$ is unramified at $q$;
  - $\bar\rho(\mathrm{Frob}_q)$ has distinct eigenvalues.
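Since $R_Q$ is generated by $r$ elements as an $\mathcal{O}$-algebra, we can define
a surjective homomorphism $\mathcal{O}[[X]] \to R_Q$. Let $G$ denote the maximal
quotient of $\prod_{q \in Q}(\mathbf{Z}/q\mathbf{Z})^\times$ of $\ell$-power order. We endow $R_{P \cup Q}$, hence $R_Q$,
with the structure of an $\mathcal{O}[G]$-algebra as in [TW], appendix (or see [dSh]
or [DDT]). Choosing generators $g_1, \ldots, g_r$ for $G$, we define a surjection


$$\mathcal{O}[[S]] \to \mathcal{O}[G], \qquad S_i \mapsto g_i - 1,$$
whose kernel is contained in $\mathfrak{m}^N$. We then define the $\mathcal{O}$-algebra homomorphism
$\mathcal{O}[[S]] \to \mathcal{O}[[X]]$ so that the diagram


$$\begin{array}{ccc}
\mathcal{O}[[S]] & \longrightarrow & \mathcal{O}[[X]] \\
\downarrow & & \downarrow \\
\mathcal{O}[G] & \longrightarrow & R_Q
\end{array}$$
commutes.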
The verification of hypothesis 3 can be viewed as the main obstacle in
improving the methods of [TW] and [W] to cover the setting of theorem 6.1.
Recall that Taylor and Wiles use a method of de Shalit to prove that (under
their hypotheses) $T_Q$ is free over $\mathcal{O}[G]$ and $T_Q/I_Q T_Q \xrightarrow{\sim} T_\emptyset$ (see §2 of
[TW] or [dSh], or see §4.3 of [DDT] for an alternative argument using the
$q$-expansion principle). The key observation made in [D] is that it suffices
to prove the following:


Lemma 7.2. There exists a nonzero $T_Q$-module which is free over $\mathcal{O}[G]$.


The proof of the lemma is very technical and is related to the methods 
of [DT]. We refer the reader to §4 of [D] for details, mentioning here only 
that it relies on the Jacquet-Langlands correspondence and a cohomological 
construction. We also point out that to prove the lemma and other results 
used below on the fine structure of the algebras $T_\Sigma$, one first realizes them
as completions of Hecke algebras acting on spaces of modular forms. (See 
for example §4.1 and §4.2 of [DDT].) 

To verify that hypothesis 1 of the theorem is satisfied, one uses that
$\mathcal{O}[G] \to R_Q$ was defined so that the augmentation ideal of $\mathcal{O}[G]$ maps onto
$I_Q = \ker(R_Q \to R_\emptyset)$. Thus we have


$$s = \mathrm{rank}_{\mathcal{O}}(T_Q/I_Q T_Q) = \mathrm{rank}_{\mathcal{O}} T_\emptyset = \mathrm{rank}_{\mathcal{O}}(T_P/I_P T_P) = s'.$$
The arguments of [TW] discussed above (or [DDT] §4.3) can be used to show that
the natural map
$$T_{P \cup Q}/(S_1, \ldots, S_r)T_{P \cup Q} \to T_P \tag{7}$$
is an isomorphism (see [D], lemma 3.3). One then deduces that


Tpug/Ipugl pug — Te/Ip, 


from which it follows that t < ¢t’ and d < d’. 
We now apply theorem 7.1 to conclude that


$$R_Q/I_Q \to T_Q/I_Q T_Q$$


is an isomorphism, and these rings are complete intersections and torsion-free
over $\mathcal{O}$. From this follows theorem 6.1 in the case $\Sigma = \emptyset$.


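7.4. The case of arbitrary $\Sigma$. Our situation now is that we have a
commutative diagram of surjective $\mathcal{O}$-algebra homomorphisms


$$\begin{array}{ccc}
R_\Sigma & \longrightarrow & T_\Sigma \\
\downarrow & & \downarrow \\
R_\emptyset & \longrightarrow & T_\emptyset;
\end{array}$$
we know that the bottom row is an isomorphism and the rings are local
complete intersections, and we wish to prove this holds for the top row.
Recall that we have assumed the existence of a map $T_\emptyset \to \mathcal{O}$ of $\mathcal{O}$-algebras.
Such a homomorphism necessarily corresponds to a newform $f$
with coefficients in $\mathcal{O}$ such that $\rho_f$ is a deformation of $\bar\rho$ of type $\emptyset$.

For arbitrary $\Sigma$, we write $\mathfrak{p}_\Sigma$ for the kernel of $R_\Sigma \to \mathcal{O}$, and $\mathfrak{P}_\Sigma$ for
the kernel of $\pi_\Sigma : T_\Sigma \to \mathcal{O}$. We consider the $\mathcal{O}$-module $\Phi_\Sigma = \mathfrak{p}_\Sigma/\mathfrak{p}_\Sigma^2$, and
the $\mathcal{O}$-ideal $\eta_\Sigma = \pi_\Sigma(\mathrm{Ann}_{T_\Sigma} \mathfrak{P}_\Sigma)$. We omit the subscript $\Sigma$ when $\Sigma = \emptyset$.
According to the Wiles-Lenstra criterion, Criterion I of [dSRS], we know
that

$$\mathrm{length}_{\mathcal{O}}(\Phi) = \mathrm{length}_{\mathcal{O}}(\mathcal{O}/\eta),$$
and we wish to prove that


$$\mathrm{length}_{\mathcal{O}}(\Phi_\Sigma) \le \mathrm{length}_{\mathcal{O}}(\mathcal{O}/\eta_\Sigma).$$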
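Using (4), one obtains as in §4.2 of [Ri2]


$$\mathrm{length}_{\mathcal{O}}(\Phi_\Sigma) \le \mathrm{length}_{\mathcal{O}}(\Phi) + \sum_{p \in \Sigma} d_p, \tag{8}$$

where $d_p$ is the length of

• $H^0(G_p, \mathrm{ad}^0\rho_f \otimes_{\mathcal{O}} K/\mathcal{O}(1))$ if $p \ne \ell$;

• $\mathcal{O}/(a_\ell(f)^2 - \psi_f(\ell))$ if $p = \ell$ does not divide $N_f$;

• 0 otherwise.
(We have used here that $\rho_f$ is of type $\emptyset$.)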

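Using that $T_\Sigma$ is Gorenstein for $\Sigma \supset P$ (Wiles’ generalization of results
of Mazur and others discussed in [Ti]), together with Wiles’ calculations of
the change in $\eta$ discussed in §4.3 of [Ri2], we find that


$$\mathrm{length}_{\mathcal{O}}(\mathcal{O}/\eta_\Sigma) \ge \mathrm{length}_{\mathcal{O}}(\mathcal{O}/\eta_P) + \sum_{p \in \Sigma - P} d_p \tag{9}$$
(provided $\Sigma \supset P$). We complement this with the inequality
$$\mathrm{length}_{\mathcal{O}}(\mathcal{O}/\eta_P) \ge \mathrm{length}_{\mathcal{O}}(\mathcal{O}/\eta) + \sum_{p \in P} d_p \tag{10}$$


established by lemma 3.6 of [D].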

Applying the Wiles-Lenstra criterion together with (8), (9) and (10), we
conclude that $\phi_\Sigma$ is an isomorphism if $\Sigma \supset P$. (This is all that is proved in
[D] and all that is needed for the corollaries.) We leave it as an exercise for
the reader to treat the case of arbitrary © by showing that if Py CPC, 
then || ,<p,(p + 1) is an element of mp(J), where J is the annihilator in 
Ts of the kernel of Ty ~ Ty_p. 
APPENDIX: CLASSIFICATION OF $\bar\rho_{E,\ell}$ BY THE
$j$-INVARIANT OF $E$


FRED DIAMOND AND KENNETH KRAMER


Let $K$ be a finite extension of $\mathbf{Q}_p$ with ring of integers $\mathcal{O}_K$ and valuation
$v_K$. Suppose that $E$ is an elliptic curve over $K$ with absolute invariant $j$ and
minimal discriminant $\Delta_E$. Assume throughout that $\ell$ is a prime different
from $p$. Let $G_K = \mathrm{Gal}(\bar{K}/K)$ and consider the mod-$\ell$ representation


$$\bar\rho_{E,\ell} : G_K \to \mathrm{Aut}(E[\ell]) \simeq \mathrm{GL}_2(\mathbf{F}_\ell).$$


In so far as possible, we wish to describe the representation type of $\bar\rho_{E,\ell}$ as
defined in section 2 of [Di] and the conductor of $E$ in terms of $j$. We rely
on observations of Serre [Se2] and on Tate’s algorithm [Ta]. For extensive
tables of Kodaira reduction types in terms of congruences on the coefficients
of a generalized Weierstrass model for $E$, see [Pa].

As motivation for this exercise, note that certain calculations of conductor
or representation type have been used to study various Diophantine
equations ([Da], [Rib]) or to prove the modularity of elliptic curves $A$ defined
over $\mathbf{Q}$ ([DK], [LR]). For example, if $A$ is semistable at 3 and if the
wild ramification group $G_1$ corresponding to the $j$-invariant of $A$ in Table 3
below is isomorphic to the quaternion group $Q$, then $A$ is modular by [Di,
Cor. 6.3].

In case $E$ has good reduction, $\bar\rho_{E,\ell}$ is unramified with cyclic image, and
therefore of type P. If $E$ is potentially multiplicative (i.e., if $v_K(j) < 0$),
then $E$ acquires multiplicative reduction over at most a quadratic extension
of $K$, and the twist of $E$ by this extension is semistable. Then the
parametrization of $E$ by $p$-adic theta functions may be used to determine
the representation of $G_K$ on the $\ell$-adic Tate module. (See for example [Si2,
Chap. V, Prop. 6.1 and ex. 5.13].) In particular, $\bar\rho_{E,\ell}$ is of type P or type
S according to whether or not $v_K(\Delta_E)$ is divisible by $\ell$.

We now restrict our attention to elliptic curves $E$ with potential good
reduction. Let $\ell' = 4$ if $\ell = 2$ and $\ell' = \ell$ otherwise. Then $E$ acquires good
reduction over any of the division fields $K(E[\ell'])$. Indeed, the kernel of $\bar\rho_{E,\ell'}$
restricted to the inertia subgroup $I_K$ of $G_K$ is independent of $\ell$. We denote
by $G_0$ the abstract group defined by the image $\bar\rho_{E,\ell'}(I_K)$. The following
lemma and the conditions of [Di, Prop. 2.1-2.4] may be used to check the
extent to which the representation type of $\bar\rho_{E,\ell}$ also is independent of $\ell$.


Lemma 0.1. Write $\sigma = \bar\rho_{E,\ell}$ and $\tilde\sigma$ for the associated projective representation.
Then
1. $\tilde\sigma(I_K)$ is cyclic if and only if $G_0$ is cyclic.
2. Suppose that $G_0$ is cyclic. Then $|\tilde\sigma(I_K)|$ is divisible by $\ell$ if and only
if $|G_0|$ is divisible by $\ell'$.
3. Suppose that $G_0$ is cyclic and $|G_0|$ is not divisible by $\ell'$. Then $\sigma$ is
reducible over $\mathbf{F}_\ell$ if and only if $E$ acquires good reduction over a cyclic
extension of $K$.


For $j \ne 0, 1728$, one choice of model with absolute invariant $j$ is given
by


$$y^2 = x^3 - 3cx - 2c, \tag{0.2}$$


with $c = j/(j - 1728)$ and discriminant $12^6 j^2/(j - 1728)^3$. Upon twisting
$E$ by a quadratic character $\psi$, the discriminant varies by a sixth power.
Furthermore, $f(E^\psi) \le \max\{f(E), 2f(\psi)\}$, with equality if $f(E) \ne 2f(\psi)$.
The representation type is not affected. If $p \ne 2$ and $\psi$ is ramified, such
a twist changes Kodaira symbols $\mathrm{I}_0$, II, III or IV to $\mathrm{I}_0^*$, IV*, III* or II*
respectively. For a discussion of the effect of a quadratic twist on Kodaira
symbol and conductor when $p = 2$, see [Co].
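The stated invariants of the model (0.2) can be checked with exact rational arithmetic. The sketch below uses the standard formulas for a short Weierstrass equation $y^2 = x^3 + Ax + B$; the function names and the sample values of $j$ and of the twisting parameter are illustrative choices, not taken from the text.

```python
from fractions import Fraction


def short_weierstrass_invariants(A, B):
    """Discriminant and j-invariant of y^2 = x^3 + A x + B."""
    disc = -16 * (4 * A ** 3 + 27 * B ** 2)
    j = -1728 * (4 * A) ** 3 / disc
    return disc, j


def generic_model(j):
    """Coefficients (A, B) = (-3c, -2c) of model (0.2), c = j / (j - 1728)."""
    j = Fraction(j)
    c = j / (j - 1728)
    return -3 * c, -2 * c


def quadratic_twist(A, B, d):
    """Twist by the quadratic character attached to sqrt(d)."""
    return A * d ** 2, B * d ** 3


for j in (Fraction(1), Fraction(-5, 7), Fraction(287496)):
    A, B = generic_model(j)
    disc, j_check = short_weierstrass_invariants(A, B)
    assert j_check == j
    assert disc == Fraction(12) ** 6 * j ** 2 / (j - 1728) ** 3
    # A quadratic twist multiplies the discriminant by a sixth power
    # and leaves j unchanged.
    twisted_disc, twisted_j = short_weierstrass_invariants(*quadratic_twist(A, B, 3))
    assert twisted_disc == disc * 3 ** 6 and twisted_j == j
```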

As for the elliptic curves with many automorphisms, if $j = 0$, the model
$y^2 + y = x^3$ has good reduction for $p \ne 3$. If $j = 1728$, the model $y^2 = x^3 - x$
has good reduction for $p \ne 2$. Note that for $j = 0$ (resp. 1728), twisting by
a ramified cubic or sextic (resp. quartic) field may affect the representation
type.


Definition. Let $j \in \mathcal{O}_K$ be given. We say that $E$ is $j$-minimal if $E$ has the
minimal conductor exponent among elliptic curves with absolute invariant
equal to $j$.


For $p \ge 5$, we extract the table below from the well-known table of
reduction types [Si2, Table 4.1]. The classification of representation types
follows from tame ramification theory [Se2, §1.3-1.5].


[TABLE 1. $j$-minimal curves over $p$-adic fields, $p \ge 5$, $j \ne 0, 1728$.
Columns: $v_K(j - 1728) \bmod 2$; $v_K(j) \bmod 3$; $v_K(\Delta_E)$; $G_0$.
Legible entries, as pairs ($v_K(\Delta_E)$, $G_0$): (0, not legible); (3 or 9, not legible);
(4 or 8, $\mathbf{Z}/3$); (2 or 10, $\mathbf{Z}/6$).]
Proposition 0.3. Suppose $p \ne 2$ and $G_0 \simeq \mathbf{Z}/4$. Then $\bar\rho_{E,2}$ is of type S.
For $\ell \ne 2$, $\bar\rho_{E,\ell}$ is of type P or type V according to whether or not $\mu_4 \subset K$.

Suppose $p \ne 3$ and $G_0 \simeq \mathbf{Z}/3$ or $\mathbf{Z}/6$. Then $\bar\rho_{E,3}$ is of type S. For $\ell \ne 3$,
$\bar\rho_{E,\ell}$ is of type P or type V according to whether or not $\mu_3 \subset K$.


RESIDUE CHARACTERISTIC $p = 3$

The table of $j$-minimal curves over $\mathbf{Q}_3$ given below may be constructed
by applying Tate’s algorithm to the generic curve (0.2). For the determination
of representation types, including cases of more general 3-adic base
fields, see the subsequent remarks. To reduce the number of entries, we
have imposed the following convention.


Convention. In each family of $j$-minimal curves for fixed $j$, the table
includes only the curves $E$ with minimal valuation of discriminant.


In Table 2, we write $v$ for $v_{\mathbf{Q}_3}$, and $j^*$ for $j - 1728$. By P-V we mean
representation type P if $j^* \in \mathbf{Q}_3^2$ or type V otherwise. By P-S we mean
type P if $\ell \ge 5$ or type S if $\ell = 2$.


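[TABLE 2. $j$-minimal curves over $\mathbf{Q}_3$ with $j \ne 0, 1728$.
Legible row conditions: $j^*/27 \equiv 1 \pmod 9$; $j^*/27 \equiv 4, 7 \pmod 9$; $v(j^*) = 4$;
$v(j^*) = 5$; $v(j^*) \ge 6$, even; $v(j) = 3n$, $n \ge 2$, with $j/3^{3n} \equiv \pm 4 \pmod 9$ or with
$j/3^{3n} \equiv \pm 1, \pm 2 \pmod 9$. The remaining column entries are not legible.]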


Suppose that $K$ is a general 3-adic field. If $E$ is a $j$-minimal curve with
$f(E) = 2$, then $G_0 \simeq \mathbf{Z}/4$ and the representation type of $\bar\rho_{E,\ell}$ is given by
Proposition 0.3. For $K = \mathbf{Q}_3$, this includes the 0-minimal curve $y^2 = x^3 + 1$.

The representation $\bar\rho_{E,\ell}$ is wildly ramified precisely when $\bar\rho_{E,\ell}(P_K) \simeq
\mathbf{Z}/3$, where $P_K$ is the wild ramification subgroup of $I_K$. Equivalently,
$f(E) \ge 3$.


Proposition 0.4. The following are equivalent:
1. $f(E)$ is odd and at least 3,
2. $\bar\rho_{E,\ell}$ is of type H for some $\ell$, or equivalently for all $\ell$,
3. $\bar\rho_{E,2}(I_K) = \mathrm{GL}_2(\mathbf{F}_2)$,
4. $G_0$ is isomorphic to the semi-direct product $\mathbf{Z}/3 \rtimes \mathbf{Z}/4$.

Proof. Write $b(\bar\rho_{E,2})$ for the wild conductor exponent ([Se3, §4.9]). Recall
that $f(E) = 2 + b(\bar\rho_{E,2})$ when $E$ has additive reduction. Furthermore
$f(E) = f(\bar\rho_{E,2})$ when the restriction $\bar\rho_{E,2}|_{I_K}$ has no non-zero fixed points.

If (1) is true, $b(\bar\rho_{E,2})$ is odd. Then $\bar\rho_{E,2}|_{I_K}$ must be irreducible, so that
(2) holds. Otherwise, after possible extension of scalars, we have


$$\bar\rho_{E,2}|_{I_K} \sim \begin{pmatrix} \psi & 0 \\ 0 & \psi^{-1} \end{pmatrix}$$


and $b(\bar\rho_{E,2}) = b(\psi) + b(\psi^{-1}) = 2\,b(\psi)$ is even.

Conversely, if (2) holds, then $\bar\rho_{E,2}(G_K) = \bar\rho_{E,2}(I_K) = \mathrm{GL}_2(\mathbf{F}_2)$. Thus
$\bar\rho_{E,2}$ is induced from a one-dimensional representation $\psi$ of $G_F$, where
$F = K(\Delta_E^{1/2})$ is the quadratic extension of $K$ inside $M = K(E[2])$. By the
inductive property of conductors [Se1, VI, §2, Cor. to Prop. 4], we have
$f(E) = f(\bar\rho_{E,2}) = f(\psi) + 1$.

Let us verify that $f(\psi)$ is even. According to the notion of conductor
in abelian class field theory, there is a unit $u \in F$ with the property that
$u$ is not a norm from $M$ and $v_F(u - 1) = f(\psi) - 1$. The Artin symbol
$s = (u, M/F)$ provides a generator for $\mathrm{Gal}(M/F)$. If $f(\psi)$ were odd, we
could find an element $\theta \in K$ such that


$$v_K(\theta) > 0 \quad\text{and}\quad v_F(u - 1 - \theta) \ge f(\psi).$$


But then $s = (1 + \theta, M/F)$, so that $\mathrm{Gal}(F/K)$ acts trivially on $s$, a contradiction.
The equivalence of (1) and (2) is proved.
One easily checks the equivalence of conditions (2), (3) and (4). $\square$


The situation is somewhat more complicated when $f(E)$ is even. By examining
the 2-division field of the generic curve (0.2), we get a classification
which may be useful when the size of $f(E)$ is known.


Corollary 0.5. Suppose that $E$ has absolute invariant $j \in \mathcal{O}_K$ with $j \ne
0, 1728$.

1. If $v_K(j - 1728)$ is odd, then either $f(E) = 2$, $G_0 \simeq \mathbf{Z}/4$ and $\bar\rho_{E,\ell}$ is
described by Proposition 0.3, or else $f(E)$ is odd, $f(E) \ge 3$ and $\bar\rho_{E,\ell}$
is of type H.

2. If $v_K(j - 1728)$ is even, then either $E$ achieves good reduction by at
most a quadratic twist and $\bar\rho_{E,\ell}$ is of type P, or else $f(E)$ is even,
$f(E) \ge 4$ and $\bar\rho_{E,\ell}$ is of type P or type V according to whether or not
$j - 1728 \in K^2$.

At least for $j \in \mathcal{O}_K$ with $v_K(j) \not\equiv 0 \pmod 3$, wild ramification is assured
in the $\ell$-division fields of $E$. Indeed, we then have $f(E) = v_K(j - 1728) + 2$.


RESIDUE CHARACTERISTIC $p = 2$


The table of $j$-minimal curves over $\mathbf{Q}_2$ given below may be constructed
by using Tate’s algorithm on the curve (0.2). Our convention on minimal
discriminant within a $j$-minimal family also holds for this table. We denote
by $G_1$ the 2-Sylow (wild ramification) subgroup of the inertia group $G_0$.
Then $G_1$ is isomorphic to a subgroup of the quaternion group $Q$ of order 8.
We explain the determination of $G_1$ and representation type appearing in
Table 3, as well as other information for more general 2-adic base fields,
after giving the table.

In Table 3, we write $v$ for $v_{\mathbf{Q}_2}$, and $j^*$ for $j - 1728$. By P-V we mean
representation type P if $j/64 \equiv 1 \pmod 8$ or type V if $j/64 \equiv 5 \pmod 8$.
By P-S we mean type P if $\ell \ge 5$ or type S if $\ell = 3$. The special cases
labeled (a) and (b) involve $j^* = 2^{2n}u$ with $n \ge 4$. Case (a) is defined by
$u \equiv 1 \pmod 4$ and also includes $j = 1728$. Case (b) is defined by $u \equiv 3
\pmod 4$.

We now assume that $K$ is a general 2-adic field and that $E$ has potential
good reduction. To study the image of inertia under $\bar\rho_{E,3}$, it suffices to
examine the 3-division field $L = K(E[3])$. The subfield $L_X = K(X(E[3]))$
of $L$ obtained by adjoining to $K$ the $x$-coordinates of points of order 3 in
a Weierstrass model contains the field $K_1 = K(\mu_3, \Delta_E^{1/3})$. Furthermore
$\mathrm{Gal}(L_X/K_1) \hookrightarrow \mathbf{Z}/2 \oplus \mathbf{Z}/2$. Note that $\mathrm{Gal}(L_X/K) \simeq \tilde\sigma(G_K)$, where $\tilde\sigma$ is
the projective representation derived from $\bar\rho_{E,3}$.

We have $G_0 = G_1$ if $v_K(\Delta_E) \equiv 0 \pmod 3$ and $G_0/G_1 \simeq \mathbf{Z}/3$ otherwise.
In particular, the representations $\bar\rho_{E,\ell}$ are tamely ramified, i.e., $f(E) =
2$, precisely when $G_0 \simeq \mathbf{Z}/3$. In that case, the representation type is
given by Proposition 0.3. The representations $\bar\rho_{E,\ell}$ are of type H precisely
when $G_1 \simeq Q$ and this is equivalent to $L_X/K_1$ being totally ramified of
degree 4. Furthermore, type H is guaranteed when $f(E) \ge 3$ and odd by
the argument in the proof of Proposition 0.4.

The next lemma, which is valid over any field of characteristic not 2 or
3, may be checked by consideration of generic $j$ as in [Ig, p. 456, p. 461,
Thm. 3, Thm. 6] or by direct computation. We shall use it to examine
ramification in $L_X/K_1$. For another approach, including a detailed study of
various Galois extensions of $\mathbf{Q}_2$, see the thesis of A. Rio [Rio]. In particular,
the representations of type V occurring over $\mathbf{Q}_2$ have also been determined
in [LR].


Lemma 0.6. Let $\zeta$ be a primitive cube root of unity. For $j \ne 1728$, the
field $L_X$ may be obtained from $K_1$ by adjoining the square roots of the
Kummer generators


$$\kappa_0 = (j^{1/3} - 12\zeta)(j^{1/3} - 12\zeta^2),$$
$$\kappa_1 = (j^{1/3} - 12)(j^{1/3} - 12\zeta),$$
$$\kappa_2 = (j^{1/3} - 12)(j^{1/3} - 12\zeta^2).$$


Note that the relation $\kappa_0\kappa_1\kappa_2 = (j - 1728)^2 \in K^2$ holds.


[TABLE 3. $j$-minimal curves over $\mathbf{Q}_2$. Columns: $v(j)$; $G_1$; Kodaira symbol;
$v(\Delta_E)$; $f(E)$; type of $\bar\rho_{E,\ell}$. The row entries are not legible.]


Proposition 0.7. Let $E$ be a $j$-minimal curve with $j \ne 0$. Write $e_K =
v_K(2)$.
1. If $v_K(j) = 0$, then $E$ has good reduction.
2. Suppose that $v_K(j) \ge 12e_K$. If $v_K(j) \equiv 0 \pmod 3$, then $E$ has good
reduction. Otherwise, $\bar\rho_{E,\ell}$ is tamely ramified.
3. If $0 < v_K(j) < 12e_K$ and $v_K(j)$ is odd, then $G_1 \simeq Q$ and $\bar\rho_{E,\ell}$ is of
type H.

Proof. For the special case $v_K(j) = 12e_K$, the element $a = 64(1728 - j)/j$
is a unit in $K$. The curve $y^2 + ay = x^3 + 3ax$ has absolute invariant equal
to $j$ and good reduction.

If $v_K(j) = 0$, consider the model $y^2 + xy = x^3 - 4bx - b$, with $\Delta =
b(1 - 64b)^2$ and absolute invariant $(1 + 192b)^3/\Delta$. Hensel’s lemma shows
that if $j \in \mathcal{O}_K$ is a unit, then there exists a unit root $r \in K$ of the equation
$(X + 192)^3 - j(X - 64)^2 = 0$. Setting $b = 1/r$ provides a curve with good
reduction and absolute invariant equal to $j$.

In the remaining cases, one determines the ramification in $L_X/K_1$ by
examining the Kummer generators of Lemma 0.6. We omit details here,
but give a similar argument in the next proposition. $\square$
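The invariants of the two models used in this proof can be confirmed with the standard formulas for a general Weierstrass equation. The following sketch does so in exact arithmetic; the function names and the sample values $j = 2^{12}$ and $b = 1/3$ are illustrative assumptions, not part of the argument.

```python
from fractions import Fraction


def invariants(a1, a2, a3, a4, a6):
    """Discriminant and j-invariant of
    y^2 + a1 xy + a3 y = x^3 + a2 x^2 + a4 x + a6."""
    b2 = a1 * a1 + 4 * a2
    b4 = 2 * a4 + a1 * a3
    b6 = a3 * a3 + 4 * a6
    b8 = a1 * a1 * a6 + 4 * a2 * a6 - a1 * a3 * a4 + a2 * a3 * a3 - a4 * a4
    c4 = b2 * b2 - 24 * b4
    disc = -b2 * b2 * b8 - 8 * b4 ** 3 - 27 * b6 * b6 + 9 * b2 * b4 * b6
    return disc, Fraction(c4 ** 3) / disc


def model_high_valuation(j):
    """y^2 + a y = x^3 + 3 a x with a = 64 (1728 - j) / j."""
    j = Fraction(j)
    a = 64 * (1728 - j) / j
    return (0, 0, a, 3 * a, 0)


def model_unit_j(b):
    """y^2 + x y = x^3 - 4 b x - b."""
    b = Fraction(b)
    return (1, 0, 0, -4 * b, -b)


# j = 2^12 has 2-adic valuation 12; here a = -37 is a 2-adic unit and the
# discriminant 27^2 * 37^3 is odd.
disc, j = invariants(*model_high_valuation(2 ** 12))
assert j == 2 ** 12 and disc == 27 ** 2 * 37 ** 3

b = Fraction(1, 3)
disc, j = invariants(*model_unit_j(b))
assert disc == b * (1 - 64 * b) ** 2
assert j == (1 + 192 * b) ** 3 / disc
```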


Proposition 0.8. An elliptic curve $E$ has maximal conductor exponent,
namely $f(E) = 6e_K + 2$, in precisely the following cases:
1. $j = 1728J$ with $J \in \mathcal{O}_K$ and $v_K(J - 1)$ odd,
2. $j = 1728$ and $E$ has a model of the form $y^2 = x^3 - ax$ with $v_K(a)$
odd.


Proof. It follows from [Se1, Chap. IV, §2, ex. 3] and [BK, Prop. 3.7] that
$f(E) = 6e_K + 2$ precisely when there exists an element with odd valuation
among the Kummer generators for $L_X/K_1$. By examining the generators
of Lemma 0.6, we find that this forces $v_K(j) = 6e_K$, in which case we
may write $j = 1728J$ for some unit $J \in \mathcal{O}_K$. Assume for the moment
that $J \ne 1$. Then our Kummer generators have even valuation $4e_K$ unless
$v_K(J - 1) > 0$. Suppose that indeed $v_K(J - 1) > 0$. It follows that $J$ is a
cube, say $J = (1 + s)^3$ with $s \in K$ and $v_K(s) \ge 1$. Then we have


$$\kappa_1 = 144s[(1 - \zeta) + s] = 144\zeta^2 s[1 + (s + 2)\zeta], \qquad \kappa_0 = 144(3 + 3s + s^2). \tag{0.9}$$


Clearly $v_K(\kappa_0) = 4e_K$ is even. Furthermore, $v_K(\kappa_1)$ is odd if and only if
$v_K(s)$ is odd. Case (1) arises in this way.

The most general curve with $j = 1728$ has a model $y^2 = x^3 - ax$,
with non-zero $a \in K$, for which $L_X = K(\mu_{12}, \sqrt{a(1 + 2\zeta)})$. But then the
extension $L_X/K(\mu_{12})$ is generated by square roots of units of $K(\mu_{12})$ unless
$v_K(a)$ is odd. Case (2) arises in this way. $\square$


It remains to find the representation type for $j$-minimal curves $E$ having
even conductor exponent $f(E) \ge 4$. When the base field is $\mathbf{Q}_2$, it then
follows from the list of conductors in Table 3, as computed by Tate’s algorithm,
that $f(E) = 8$ and $j = 1728J$ with $v(J - 1)$ being odd. Let $G$ be
the image of $\bar\rho_{E,3}$ and write $D_8$ for the dihedral group of order 8.


Lemma 0.10. Suppose $j \in \mathbf{Z}_2$ has the form $j = 1728J$ with $v(J - 1)$ odd.
1. If $v(J - 1) \ge 3$, then $G_1 \simeq Q$ and $\bar\rho_{E,\ell}$ is of type H.
2. If $v(J - 1) = 1$, then $G_1 \simeq \mathbf{Z}/4$. Furthermore, either $G \simeq \mathbf{Z}/8$ and
$\bar\rho_{E,\ell}$ is of type P, or else $G \simeq D_8$ and $\bar\rho_{E,\ell}$ is of type V, according as
$J \equiv 3 \pmod 8$ or $J \equiv 7 \pmod 8$.


Proof. Under these hypotheses, $J$ is a cube, say $J = (1 + s)^3$ as in the notation
of the proof of Proposition 0.8. Then $K_1 = \mathbf{Q}_2(\mu_3, j^{1/3}) = \mathbf{Q}_2(\mu_3)$.
In case (1), we have $v(s) \ge 3$. From the generators in (0.9) we see that
$L_X/K_1$ is totally ramified of degree 4. Thus $G_1 \simeq Q$.

In case (2), the following congruences hold modulo 8: either


$$s \equiv 2 \text{ and } \kappa_0/144 \equiv -3, \quad\text{or else}\quad s \equiv -2 \text{ and } \kappa_0/144 \equiv 1,$$


according as $J \equiv 3$ or $J \equiv 7$. For either possibility, we have $L_X = K_1(\sqrt{\kappa_1})$.
The extension $L_X/K_1$ clearly is ramified. It follows that $G_1 \simeq \mathbf{Z}/4$. If
$\sigma$ denotes a generator for $\mathrm{Gal}(K_1/\mathbf{Q}_2)$, we have $\kappa_1^{1+\sigma} \in \kappa_0(\mathbf{Q}_2^\times)^2$. Thus
$\mathrm{Gal}(L_X/\mathbf{Q}_2) \simeq \mathbf{Z}/2 \oplus \mathbf{Z}/2$ or $\mathbf{Z}/4$ depending on whether or not $\kappa_0 \in (\mathbf{Q}_2^\times)^2$.
Then the claimed structure of $G$ follows accordingly. $\square$
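The identities (0.9), the relation $\kappa_0\kappa_1\kappa_2 = (j - 1728)^2$ and the congruences modulo 8 used above can be checked by exact arithmetic in $\mathbf{Z}[\zeta]$ with $\zeta^2 = -1 - \zeta$. In the sketch below an element $a + b\zeta$ is stored as the pair `(a, b)`, and $s$ is restricted to ordinary integers, which is an assumption made only for the purpose of the check; the function names are hypothetical.

```python
def mul(u, v):
    """Multiply a + b*zeta and c + d*zeta in Z[zeta], using zeta^2 = -1 - zeta."""
    a, b = u
    c, d = v
    return (a * c - b * d, a * d + b * c - b * d)


def kummer_generators(s):
    """(kappa_0, kappa_1, kappa_2) for j = 1728 (1 + s)^3, i.e. j^(1/3) = 12 (1 + s)."""
    t = 12 * (1 + s)
    m0 = (t - 12, 0)      # j^(1/3) - 12
    m1 = (t, -12)         # j^(1/3) - 12*zeta
    m2 = (t + 12, 12)     # j^(1/3) - 12*zeta^2 = t + 12 + 12*zeta
    return mul(m1, m2), mul(m0, m1), mul(m0, m2)


for s in (2, -2, 10, -6, 8, 5):
    k0, k1, k2 = kummer_generators(s)
    j = 1728 * (1 + s) ** 3
    # The two identities of (0.9).
    assert k0 == (144 * (3 + 3 * s + s * s), 0)
    assert k1 == (144 * s * (1 + s), -144 * s)      # 144 s [(1 - zeta) + s]
    # Relation noted after Lemma 0.6.
    assert mul(mul(k0, k1), k2) == ((j - 1728) ** 2, 0)

# Congruences modulo 8 from case (2) of Lemma 0.10.
for s, J_mod8, k0_mod8 in ((2, 3, -3 % 8), (-2, 7, 1)):
    assert (1 + s) ** 3 % 8 == J_mod8
    assert (3 + 3 * s + s * s) % 8 == k0_mod8
```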
CLASS FIELD THEORY AND THE FIRST
CASE OF FERMAT’S LAST THEOREM


H.W. LENSTRA, JR. AND P. STEVENHAGEN


For a prime number $p$, the first case of Fermat’s last theorem for exponent
$p$ asserts that for any three integers $x$, $y$, $z$ with $x^p + y^p + z^p = 0$,
at least one of $x$, $y$, $z$ is divisible by $p$. In the present chapter we use
class field theory to prove several classical results concerning the first case.
Our treatment is based on Hasse’s exposition [6, Section 22], but whereas
Hasse applied explicit reciprocity laws, our proofs depend only on general
properties of power and norm residue symbols.


Theorem 1. The first case of Fermat’s last theorem with exponent $p$ is
correct for each prime number $p$ for which $2p + 1$ is prime.


This theorem is due to Sophie Germain (1823).

For a positive integer $k$, we define $N_k = \prod_{\eta,\vartheta}(1 + \eta + \vartheta)$, the product
ranging over all $k$th roots of unity $\eta$ and $\vartheta$ in an algebraic closure of the
field $\mathbf{Q}$ of rational numbers. It is easy to see that $N_k$ is a rational integer
for each $k$, and that $N_k$ vanishes if and only if $k$ is divisible by 3.
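The integers $N_k$ can be computed exactly for small $k$. One possible approach, sketched below, uses the identity $\prod_\vartheta (1 + \eta + \vartheta) = (1 + \eta)^k - (-1)^k$, so that $N_k$ is the product of $g(\eta) = (1 + \eta)^k - (-1)^k$ over all $k$th roots of unity $\eta$, which equals the determinant of the circulant matrix built from the coefficients of $g$ reduced modulo $x^k - 1$. This is an illustrative method rather than the one used in the text, and the function names are hypothetical.

```python
from math import comb


def bareiss_det(matrix):
    """Exact determinant of an integer matrix by fraction-free elimination."""
    m = [row[:] for row in matrix]
    n = len(m)
    sign, prev = 1, 1
    for c in range(n - 1):
        if m[c][c] == 0:
            pivot = next((r for r in range(c + 1, n) if m[r][c] != 0), None)
            if pivot is None:
                return 0
            m[c], m[pivot] = m[pivot], m[c]
            sign = -sign
        for r in range(c + 1, n):
            for j in range(c + 1, n):
                m[r][j] = (m[r][j] * m[c][c] - m[r][c] * m[c][j]) // prev
        prev = m[c][c]
    return sign * m[n - 1][n - 1]


def n_k(k):
    """N_k = product of (1 + eta + theta) over all k-th roots of unity eta, theta."""
    # Coefficients of (1 + x)^k - (-1)^k reduced modulo x^k - 1.
    coeffs = [comb(k, i) for i in range(k)]
    coeffs[0] += 1 - (-1) ** k
    # The product of g over the k-th roots of unity is a circulant determinant.
    circulant = [[coeffs[(j - i) % k] for j in range(k)] for i in range(k)]
    return bareiss_det(circulant)


for k in range(1, 11):
    print(k, n_k(k))
```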


Theorem 2. Let $p$ be a prime number, and suppose that there exists a
positive integer $k$ not divisible by $p$ for which $kp + 1$ is a prime number not
dividing $N_k$. Then the first case of Fermat’s last theorem with exponent $p$
is correct.


This result, which is similar to a theorem of Wendt (1894), is taken
from [1]. The integer $k$ is necessarily even and not divisible by 3.

Let $k$ be a positive integer, and let $T_k$ be the set of odd primes $p$ for
which $p$ divides $k$ or $kp + 1$ is a prime factor of $N_k$. By Theorem 2, the
first case of Fermat’s last theorem is correct for exponent $p$ if $p$ is a prime
number not in $T_k$ for which $kp + 1$ is prime. When $k$ is not divisible by 3,
the estimate $|N_k| < 3^{k^2}$ shows that the exceptional set $T_k$ has cardinality
at most $k^2 + \log k$.

In 1985, Adleman, Heath-Brown, and Fouvry [1, 4] deduced from Theorem 2
that the first case is valid for infinitely many $p$, as follows. Using sieve
methods, Fouvry showed that there exists $c > 0$ with the following property:
for all sufficiently large $t$, there are at least $c \cdot t/\log t$ prime numbers
$q \le t$ with $q \equiv 2 \bmod 3$ for which $q - 1$ has a prime factor $p > t^{0.6687}$. Each
pair $q$, $p$ gives rise to an integer $k = (q - 1)/p$ that is less than $u = t^{0.3313}$.
The inequality $c \cdot t/\log t > u \cdot (u^2 + \log u)$, which is valid when $t$ is large
enough, shows that some value of $k$ must arise for more than $k^2 + \log k$
pairs $q$, $p$. For at least one of these pairs the number $p$ is outside $T_k$, so
that the first case holds for $p$.

From $N_2 = -3$ one finds that $T_2$ is empty, so Theorem 1 follows from
Theorem 2, with $k = 2$. In general, when $k$ is a given positive integer that
is not divisible by 3, then it is usually easy to deduce from Theorem 2 that
the first case of Fermat’s last theorem is correct for each prime exponent $p$
for which $kp + 1$ is prime. For example, from


$$N_4 = -3 \cdot 5^3, \qquad N_8 = -3^7 \cdot 5^3 \cdot 17^3, \qquad N_{10} = -3 \cdot 11^9 \cdot 31^3$$


one finds $T_4 = T_8 = \emptyset$ and $T_{10} = \{3, 5\}$. Since Theorem 1 applies to $p = 3$
and to $p = 5$, one concludes that the first case is true for $p$ if $4p + 1$,
$8p + 1$, or $10p + 1$ is a prime number. This result is due to Legendre (1823).
Exceptional primes $p$ that may arise for other values of $k$ are generally
easily dealt with by means of the following theorem.


Theorem 3. Let $p$ be a prime number, and suppose that the first case of
Fermat’s last theorem for exponent $p$ is false. Then we have

(a) $2^{p-1} \equiv 1 \bmod p^2$,

(b) $3^{p-1} \equiv 1 \bmod p^2$.


These two results are due to Wieferich (1909) and Mirimanoff (1910),
respectively.

There is an efficient algorithm that for a given prime number $p$ tests
the validity of (a) and (b). It is believed that there is not a single prime $p$
satisfying both (a) and (b), so that this algorithm, combined with Theorem
3, could be used to prove the first case of Fermat’s last theorem for any
given prime exponent. This belief is borne out by numerical experiments.
In fact, of all primes for which (a) has ever been tested (and this includes
all primes less than $4 \cdot 10^{12}$, see [3]) only $p = 1093$ and $p = 3511$ satisfy (a),
and neither of these primes satisfies (b). (The only primes $p < 2^{32} \approx 4.3 \cdot 10^9$
satisfying (b) are $p = 11$ and $p = 1{,}006{,}003$, see [8].)
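The test for (a) and (b) amounts to one modular exponentiation each. A minimal sketch follows, using Python's built-in three-argument `pow`; the function names and the search bound 4000 are illustrative choices, and primality is decided by simple trial division.

```python
def satisfies_a(p):
    """Condition (a): 2^(p-1) = 1 mod p^2."""
    return pow(2, p - 1, p * p) == 1


def satisfies_b(p):
    """Condition (b): 3^(p-1) = 1 mod p^2."""
    return pow(3, p - 1, p * p) == 1


def is_prime(n):
    """Trial division; adequate for the small range searched here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


primes = [p for p in range(2, 4000) if is_prime(p)]
print([p for p in primes if satisfies_a(p)])
print([p for p in primes if satisfies_b(p)])
print([p for p in primes if satisfies_a(p) and satisfies_b(p)])
```

By Theorem 3, any prime for which either test fails is an exponent for which the first case holds.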

It is an amusing consequence of (a) that the first case of Fermat’s last 
theorem holds for exponents that are Mersenne or Fermat primes. 

Several mathematicians proved, with the same hypotheses as in Theorem
3, that for various other small prime numbers $q$ one has $q^{p-1} \equiv 1 \bmod p^2$.
The best result of this nature, prior to the work of Wiles and Taylor, was
obtained by Granville and Monagan [5], who covered all prime numbers
$q \le 89$. If it had been possible to replace 89 by an expression that tends
sufficiently rapidly to infinity with $p$, such as $4 \cdot (\log p)^2$, then the first case
of Fermat’s last theorem would have followed for all $p$, by [7]; but this could
apparently not be achieved by the method of [5]. However, by a theorem of
Gunderson (1948) the bound 89 is good enough to imply the first case for
all $p$ up to the limit in the title of [5]. Tanner and Wagstaff [9] improved
upon Gunderson’s work and raised the limit to 156,442,236,847,241,729.

In the proofs, we let $p$ be a prime number, and we let $\zeta$ be a primitive
$p$th root of unity in an extension field of $\mathbf{Q}$. We denote by $\left(\frac{\cdot}{\cdot}\right)$ the $p$th
power residue symbol for the cyclotomic field $\mathbf{Q}(\zeta)$, and by $\mathfrak{p} = (\zeta - 1)$
the unique prime of $\mathbf{Q}(\zeta)$ lying over $p$. The properties of power and norm
residue symbols that we use can all be found in [2, pp. 348-353].

Let it now be supposed that $x$, $y$, $z$ are integers not divisible by $p$
that satisfy $x^p + y^p + z^p = 0$. Clearly, $p$ is odd. Removing a greatest
common divisor, we may assume that $x$, $y$, $z$ are pairwise coprime. We
have $\prod_{i=0}^{p-1}(x + y\zeta^i) = x^p + y^p = -z^p$, and from $\gcd(x, y) = \gcd(p, z) = 1$
it follows that the factors $x + y\zeta^i$ are pairwise coprime. Hence each factor
generates an ideal that is a $p$th ideal power.


Lemma 1. Let $n$ be an integer that is coprime to $p$ and $z$. Then we have

$$\left(\frac{x + y\zeta}{n}\right) = \left(\frac{\zeta}{n}\right)^{-y/z},$$

where the exponent $-y/z$ is computed modulo $p$.


Proof. With $\alpha = (x + y\zeta)\zeta^{y/z}$, the assertion reads $\left(\frac{\alpha}{n}\right) = 1$. Note that $(\alpha)$
is a $p$th ideal power that is coprime to $n$, so the definition of the power
residue symbol gives $\left(\frac{n}{\alpha}\right) = 1$. The general power reciprocity law (see [2, p.
352, Exercise 2.10]) asserts in this case that $\left(\frac{n}{\alpha}\right)\left(\frac{\alpha}{n}\right)^{-1}$ equals the $\mathfrak{p}$-adic $p$th
power norm residue symbol $(n, \alpha)_{\mathfrak{p}}$. Hence it suffices to prove $(n, \alpha)_{\mathfrak{p}} = 1$.
We do this by a computation in the ring of integers of the local field at $\mathfrak{p}$.
The units of that ring taken modulo $\mathfrak{p}^2$ are of the form $a + b(\zeta - 1)$, where
$a, b \in \mathbf{Z}/p\mathbf{Z}$, $a \ne 0$. They form a group of order $(p - 1)p$, which is the
direct product of a group of order $p - 1$, consisting of the elements with
$b = 0$, and a group of order $p$, consisting of the elements with $a = 1$; the
latter group is generated by $\zeta$, since $\zeta^b \equiv 1 + b(\zeta - 1) \bmod (\zeta - 1)^2$. A
general element $a + b(\zeta - 1)$ is decomposed as $a \cdot \zeta^{b/a}$. Applying this to
$x + y\zeta \pmod{\mathfrak{p}^2}$, which has $a = x + y \equiv -z \bmod p$ and $b = y$, we find that
the $\langle\zeta\rangle$-component of $x + y\zeta \pmod{\mathfrak{p}^2}$ equals $\zeta^{-y/z}$. The other component
must then be $(x + y\zeta)/\zeta^{-y/z} = \alpha$. Therefore the order of $\alpha \pmod{\mathfrak{p}^2}$
divides $p - 1$, and $\alpha^{p-1} = 1 - \beta$ with $\beta \in \mathfrak{p}^2$. Also, $n^{p-1}$ is of the form
$1 - \gamma$, with $\gamma \in (p) = \mathfrak{p}^{p-1}$. From $\beta\gamma \in \mathfrak{p}^{p+1}$ it follows that $1 - \beta\gamma = \delta^p$
for some non-zero $\delta$ in the $\mathfrak{p}$-adic field (cf. [2, p. 353, Exercise 2.12]).
Using the bimultiplicativity of the norm residue symbol and the fact that
$(1 - \gamma, \gamma)_{\mathfrak{p}} = 1$ we find


$$(n, \alpha)_{\mathfrak{p}} = (n^{p-1}, \alpha^{p-1})_{\mathfrak{p}} = (1 - \gamma, 1 - \beta)_{\mathfrak{p}} = (1 - \gamma, (1 - \beta)\gamma)_{\mathfrak{p}} = 1,$$


the last step because $(1 - \gamma) + (1 - \beta)\gamma = \delta^p$ (see [2, p. 351, Exercise 2.5]).
This proves Lemma 1.


From Lemma 1, we obtain the following result of Furtwangler (1912). 


Lemma 2. We have $q^{p-1} \equiv 1 \bmod p^2$ for every prime number $q$ that
satisfies one of the following conditions:
(i) $q$ divides $x$, $y$, or $z$;
(ii) one of the differences $x - y$, $y - z$, $z - x$ is divisible by $q$ but not by $p$.


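Proof. Suppose first that $q$ is a prime number dividing $y$. Then $q$ does not
divide $p$ or $z$, so we can apply Lemma 1 with $n = q$ to find
$\left(\frac{x}{q}\right) = \left(\frac{x + y\zeta}{q}\right) = \left(\frac{\zeta}{q}\right)^{-y/z}$.
As $\left(\frac{x}{q}\right)$ is a Galois-invariant $p$th root of unity, it equals 1. Also
$-y/z \not\equiv 0 \bmod p$, so we have $\left(\frac{\zeta}{q}\right) = 1$. The formula $\left(\frac{\zeta}{q}\right) = \zeta^{(q^{p-1}-1)/p}$ from
[2, p. 349, Exercise 1.6] now implies $q^{p-1} \equiv 1 \bmod p^2$.

Next, suppose that $q$ is a prime number dividing $x - y$, and that $x - y$ is
not divisible by $p$. Clearly, we may assume that $q$ does not divide $z$. From
the equality $\left(\frac{x + y\zeta}{q}\right) = \left(\frac{y + x\zeta}{q}\right)$ it follows, by another application of Lemma 1,
that $\left(\frac{\zeta}{q}\right)^{-y/z}$ and $\left(\frac{\zeta}{q}\right)^{-x/z}$ are equal. As $-y/z$ and $-x/z$ are not congruent
modulo $p$, this implies $\left(\frac{\zeta}{q}\right) = 1$. As before, we obtain $q^{p-1} \equiv 1 \bmod p^2$.
This proves Lemma 2.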


We derive Theorem 3 from Lemma 2. By the assumption of the theorem,
there exist $x$, $y$, $z$ as above. As one of $x$, $y$, $z$ is even, condition (i) holds
for $q = 2$. This yields (a). To prove (b), we first note that by (a) we have
$p \ne 3$. It suffices to show that one of the conditions in Lemma 2 is met by
$q = 3$. If 3 divides one of $x$, $y$, $z$, then (i) holds. Otherwise, the congruence
$x^p + y^p + z^p \equiv 0 \bmod 3$ shows that 3 divides all differences $x - y$, $y - z$,
$z - x$; but from $3x^p \not\equiv 0 \bmod p$ it follows that these differences are not all
divisible by $p$, so (ii) holds. This completes the proof of Theorem 3.
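
The congruences $q^{p-1} \equiv 1 \bmod p^2$ for $q = 2$ and $q = 3$ obtained above are easy to test numerically. The following Python sketch is only an illustration: the function names and the search bound 4000 are our own choices, not part of the text. It lists the primes below the bound that satisfy each congruence.

```python
def satisfies_condition(q, p):
    """Return True when q^(p-1) is congruent to 1 modulo p^2."""
    return pow(q, p - 1, p * p) == 1


def primes_up_to(n):
    """Sieve of Eratosthenes returning all primes <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            # Cross out the multiples of i starting at i*i.
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]


BOUND = 4000  # illustrative search bound
base_2 = [p for p in primes_up_to(BOUND) if satisfies_condition(2, p)]
base_3 = [p for p in primes_up_to(BOUND) if satisfies_condition(3, p)]
both = sorted(set(base_2) & set(base_3))
print(base_2, base_3, both)
```

In this range no prime satisfies both congruences at once, so for every prime below the bound the argument above already rules out a first-case solution.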

We next prove Theorem 2. Let $k$ be a positive integer for which $q = kp + 1$
is prime. It suffices to show that if $x$, $y$, $z$ are as above, then $p$ divides $k$ or
$q$ divides $N_k$. We distinguish two cases. First suppose that one of $x$, $y$, $z$ is
divisible by $q$. From Lemma 2 it follows that $q^{p-1} \equiv 1 \bmod p^2$, so we have
$1 + kp = q \equiv q^p = (1 + kp)^p \equiv 1 \bmod p^2$. Thus, in this case $p$ divides $k$.
Next, suppose that none of $x$, $y$, $z$ is divisible by $q$. From $p = (q - 1)/k$
we see that each of $x^p$, $y^p$, $z^p$, when taken modulo $q$, is a $k$th root of unity
in the finite field $\mathbb{Z}/q\mathbb{Z}$. Hence there are, in the ring of $q$-adic integers,
$k$th roots of unity $\varepsilon$, $\eta$, $\theta$ (say) that are congruent to $x^p$, $y^p$, and $z^p$,
respectively, modulo $q$. From $x^p + y^p + z^p = 0$ we find $\varepsilon + \eta + \theta \equiv 0 \bmod q$,
so that now $q$ divides $N_k$. This proves Theorem 2.

Above we saw already that Theorem 1 follows from Theorem 2. 
REMARKS ON THE HISTORY OF
FERMAT'S LAST THEOREM 1844 TO 1984


MICHAEL ROSEN 


Introduction 

It is arguably true that Fermat’s Last Theorem (FLT) has been the most 
famous of all mathematical problems for at least three centuries. There 
has been debate about whether it is a serious and important problem or 
merely a curiosity, but there can be no denying its popularity. Generations 
of mathematicians, both professional and amateur, have tried their hand 
at solving it. These efforts have resulted in a mighty body of theory with 
many deep and important results. Nevertheless, until 1984, when G. Frey 
connected the problem in an intimate way with the arithmetic theory of 
elliptic curves and the conjecture of Taniyama-Shimura-Weil (after earlier
work in the same direction by Y. Hellegouarch), a solution seemed as far
away as ever. 

We will attempt to review some of the highlights among the results 
obtained in the period from 1844 to 1984 (and a few that came after). 

As is well known, the study of Fermat's last theorem for exponent $n$
reduces rapidly to the cases where $n = 4$ or $n$ is equal to an odd prime $p$.
Fermat himself proved the theorem for $n = 4$ and claimed to have proven
it for $p = 3$. By 1844 the theorem was known only for $p = 3$, 5, and
7 with proofs by L. Euler, L. Dirichlet and A. Legendre, and G. Lamé,
respectively. The greatest contributions to our subject up to the last few
years were made by E. Kummer. We begin our discussion with the year
1844, because that was the year Kummer published his results on the theory
of ideal numbers in the field $\mathbb{Q}(\zeta_p)$. There is some controversy over whether
Kummer's primary motivation for creating this theory was his interest in
higher reciprocity laws or his interest in FLT. He was clearly interested in
both, although he thought the reciprocity laws were more important. In any
case, three years later in 1847 he completed the proof of his great theorem:
FLT is true for regular primes $p$, i.e., primes $p$ which do not divide the
class number of $\mathbb{Q}(\zeta_p)$. To do this he had to define the class group, prove
it is finite, analyze it as a product of two factors $h_p = h_p^+ h_p^-$ (the real class
number times the relative class number), and do a profound investigation
of the unit group. Beyond this he was able to relate the regularity of the
prime $p$ to divisibility by $p$ of certain Bernoulli numbers. We will review
this work and give a sketch of his proof of FLT for regular primes. Kummer
claimed that FLT is true for regular primes $p$ even if we allow entries from
$\mathbb{Q}(\zeta_p)$. Kummer's proof of this latter claim contains an error, but his error
was later patched up by Hilbert. It is interesting to ask if FLT is true for
all odd primes $p$ if we allow entries from $\mathbb{Q}(\zeta_p)$. The modular proof over $\mathbb{Z}$
does not seem to extend easily to this case.

Kummer returned to FLT in 1857 when he published a paper dealing
with certain cases of irregular primes $p$ for which FLT can be proven.
There are some problems with this paper. The results were put on a firm
foundation and further refined many years later (1929) by H.S. Vandiver.
We will mention some of this work, which is the foundation for calculations
which have shown FLT to be true up to very large bounds. This culminated
in a 1993 paper of J. Buhler, R. Crandall, R. Ernvall, and T. Metsänkylä which
showed FLT to be true for all primes up to four million. This may be of
limited interest now that we know FLT is true without restriction, but these
calculations have bearing on other issues as well, for example Vandiver's
conjecture that $p$ does not divide $h_p^+$ (already conjectured by Kummer
many years earlier). The truth or falsity of this is still unknown, but it is
true for all primes up to four million.

Irregular primes and Vandiver's conjecture both relate to the interesting
question of the structure of $A_p$, the $p$-primary part of the class group of
$\mathbb{Q}(\zeta_p)$. Kummer made a fundamental contribution to this by proving a
special case of Stickelberger's theorem on the prime decomposition of Gauss
sums. We will discuss the consequences of this, among them the important
theorem of J. Herbrand (1932). Its converse was proved by K. Ribet in
1976. Ribet's proof used the arithmetic theory of modular forms and Galois
representation theory and can thus be seen as an early breakthrough
demonstrating the power of the methods which eventually led to a proof of
FLT. We will also discuss subsequent work of A. Wiles and Mazur-Wiles
on the structure of $A_p$.

The first case of FLT, i.e., when $x^p + y^p = z^p$ and it is assumed that
$(xyz, p) = 1$, will be dealt with elsewhere in this volume [LS]. We will
concentrate on the second case. With a subject so vast it is inevitable that 
much of value will be left out. We hope that the material to be included 
will be sufficient to give a good sense of the fascinating and intricate history 
of this old conjecture which is finally a theorem. 

Before beginning, we point out that we will assume the reader is fa- 
miliar with elementary algebraic number theory and the elements of p-adic 
numbers. We will not attempt to use the language and notation of the nine- 
teenth century, but will use more or less standard modern notation. What 
is lost in historical flavor by this process is made up for by an increase in 
clarity (or so we hope). 
Section 1. Fermat's last theorem for polynomials.
We begin our discussion, somewhat a-historically, by considering $x^n + y^n = z^n$,
$n > 2$, over a polynomial ring $A = k[T]$ where $k$ is a field of
characteristic prime to $n$. We claim there are no non-zero solutions in $A$ except
possibly for constant solutions, i.e., when $x$, $y$, and $z$ are in $k$ [Gr]. Suppose
there is a solution $x, y, z \in A$ where $x$, $y$, and $z$ are non-zero and the
maximum of the three degrees is $d > 0$. We will show that there is a solution
$x'$, $y'$, and $z'$, all non-zero and with the maximum of the three degrees $d'$
satisfying $d > d' > 0$. We could then repeat the process indefinitely. Of
course, this constitutes a contradiction since there are only finitely many
positive integers less than $d$. This is Fermat's method of "infinite descent."
Before we begin, note that it is no restriction to assume that $k$ is
algebraically closed. Further, since $A$ is a unique factorization domain, a
moment's reflection shows that we may assume that $x$, $y$, and $z$ are
relatively prime. Now, consider the identity


$$\prod_{i=0}^{n-1}(x + \zeta^i y) = z^n. \qquad (1)$$

Here, $\zeta$ is a primitive $n$'th root of unity in $k^*$. The factorization in
equation (1) is the basis of all attempts to prove FLT before the elliptic
curve/modular function approach. Every pair of factors on the left is
relatively prime since if $w$ divided the $i$'th and $j$'th factor it would also divide
$(\zeta^i - \zeta^j)y$ and $(\zeta^{j-i} - 1)x$, and these two polynomials are relatively prime.
It follows easily that each factor is itself a constant times an $n$'th power.
Since we have assumed that $k$ is algebraically closed, each constant is also
an $n$'th power. Thus, there are polynomials $u$, $v$, and $w$ such that

$$x + y = u^n, \qquad x + \zeta y = v^n, \qquad x + \zeta^2 y = w^n.$$

Eliminating $x$ and $y$ from these three equations, we find

$$w^n + \zeta u^n = (1 + \zeta)v^n.$$

Again using that $k$ is algebraically closed we set $x' = w$, $y' = \sqrt[n]{\zeta}\,u$,
and $z' = \sqrt[n]{1 + \zeta}\,v$. Then, $x'$, $y'$, and $z'$ constitute a new solution with the
required properties. In fact, $d' \le d/n$.


It may be asked where the assumption $n > 2$ was used. The point is
that if $n = 2$, then $\zeta = -1$ and so $1 + \zeta = 0$ and $z' = 0$. This shows that
the proof breaks down if $n = 2$ but is sound for all $n > 2$.
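
That the restriction to $n > 2$ is genuinely needed can be seen from an explicit non-constant solution for $n = 2$, namely $(T^2 - 1)^2 + (2T)^2 = (T^2 + 1)^2$. The short Python check below is a sketch of our own (the helper names are hypothetical, and polynomials are represented as coefficient lists, lowest degree first); it verifies this identity over the integers, and hence over any field of characteristic different from 2.

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def poly_add(a, b):
    """Add two polynomials given as coefficient lists (lowest degree first)."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [s + t for s, t in zip(a, b)]


x = [-1, 0, 1]  # T^2 - 1
y = [0, 2]      # 2T
z = [1, 0, 1]   # T^2 + 1

lhs = poly_add(poly_mul(x, x), poly_mul(y, y))
rhs = poly_mul(z, z)
print(lhs, rhs)
```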

Another interesting point is that the proof applies equally well to poly- 
nomials in several variables over a field. 
Section 2. Kummer's work on cyclotomic fields.

The proof of FLT for the polynomial ring A = k[T] is very short and 
sweet. It contains the main ideas of the early attempts to prove Fermat’s 
assertion, but is much easier. Why is that? To begin with, $\mathbb{Z}$, the ring of
rational integers, contains no roots of unity except $\pm 1$. To compensate for
this the relevant roots of unity were added to $\mathbb{Z}$ and the arithmetic of the
resulting rings investigated. Letting $i = \sqrt{-1}$, C.F. Gauss investigated
the ring $\mathbb{Z}[i]$ in his celebrated pair of papers on biquadratic reciprocity
[G]. Let $\omega = e^{2\pi i/3}$. The ring $\mathbb{Z}[\omega]$ was investigated by C.G. Jacobi and
independently by G. Eisenstein in the course of formulating and proving
the law of cubic reciprocity. Gauss had investigated the same ring in an
unpublished paper proving FLT for $p = 3$. These authors, and others,
also experimented with other roots of unity. For small primes $p$ it turns
out that $\mathbb{Z}[\zeta_p]$ is a unique factorization domain. In fact, this is true for
all primes $p$ less than 23. G. Lamé, perhaps led astray by this fact,
announced in 1847 that he had a proof of FLT. Liouville almost immediately
pointed out that he was implicitly assuming unique factorization. Soon
thereafter, Liouville received a letter from Kummer which pointed out
that not only is the assumption without proof, it is not correct. Kummer
had shown three years earlier that $\mathbb{Z}[\zeta_{23}]$ is not a unique factorization
domain. As we shall see in a moment, Kummer had done much, much more
than that.

There is a story, told by K. Hensel in an address given on the hundredth 
anniversary of Kummer’s birth, that Kummer himself had once constructed 
a proof of FLT assuming unique factorization and that the error had been 
pointed out to him by Dirichlet. Although this story has been widely retold 
in subsequent works on number theory, it is probably incorrect. This was 
pointed out, with fairly convincing evidence, by H. Edwards in a pair of 
papers [Ed2,Ed3] which are interesting reading both because of this issue 
and also for a general historical account of the events of 1844 to 1847 which 
bear on FLT. 

Kummer not only noticed the failure of unique factorization, he
invented his theory of ideal numbers to restore this immensely useful
property for the rings $\mathbb{Z}[\zeta_p]$. Later, R. Dedekind extended Kummer's work by
discussing general rings of algebraic numbers and by reinterpreting
Kummer's ideal numbers by means of ideals. Because Dedekind's
language is so much more familiar to modern readers we will use it rather
than Kummer's. The interested reader can consult Edwards' book [Ed1]
for a discussion of Kummer's point of view.

For the rings under consideration, Kummer proved that every ideal
is the product of prime ideals in a unique way. He also defined the usual
equivalence relation on ideals, proved the equivalence classes form a group,
and that this group is finite. Let's call the ideal class group of $\mathbb{Z}[\zeta_p]$,
$Cl_p$, and its order, $h_p$, the class number. We also refer to $h_p$ as the class
number of the field $\mathbb{Q}(\zeta_p)$. Unique factorization holds for elements if and
only if $h_p = 1$.

Another complication arises when $p \ge 5$. Namely, the unit group will
contain elements of infinite order. As is well known the units in $\mathbb{Z}[i]$ are
the fourth roots of unity and those in $\mathbb{Z}[\omega]$ are the sixth roots of unity. For
$p \ge 5$ we define the elements (set $\zeta_p = \zeta$)

$$\varepsilon_k = \sqrt{\frac{\zeta^k - 1}{\zeta - 1} \cdot \frac{\zeta^{-k} - 1}{\zeta^{-1} - 1}} = \frac{\sin(k\pi/p)}{\sin(\pi/p)} \quad \text{for } k = 2, 3, \ldots, \frac{p-1}{2}. \qquad (2)$$

These elements are easily seen to be units, called cyclotomic units,
in $\mathbb{Z}[\zeta_p]$. Kummer shows that they are independent; i.e., they generate a
free abelian group of rank $\frac{p-3}{2}$. Let $C_p$ be the subgroup of the unit group
$E_p$ generated by the $\varepsilon_k$ and the roots of unity. $C_p$ is called the group of
cyclotomic units. Kummer shows that $C_p$ is of finite index in $E_p$ and gives
the following beautiful interpretation of the index.


Theorem 2.1. The index $[E_p : C_p] = h_p^+$, where $h_p^+$ is the class number
of $\mathbb{Q}(\zeta_p + \zeta_p^{-1}) = \mathbb{Q}(\zeta_p)^+$, the maximal real subfield of $\mathbb{Q}(\zeta_p)$.
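
The cyclotomic units of equation (2), $\varepsilon_k = \sin(k\pi/p)/\sin(\pi/p)$, can be explored numerically. The Python sketch below uses function names of our own choosing and ordinary floating-point arithmetic, so it is an illustration rather than a proof. It checks that the square-root expression in (2) agrees with the quotient of sines, and that the product of the absolute values $|\sin(ka\pi/p)/\sin(a\pi/p)|$ over $a = 1, \ldots, (p-1)/2$ is 1. Since the conjugates of $\varepsilon_k$ over $\mathbb{Q}$ are, up to sign, exactly these quotients, this is consistent with $\varepsilon_k$ having norm $\pm 1$, as a unit must.

```python
import cmath
import math


def cyclotomic_unit(p, k):
    """Evaluate the square-root expression for epsilon_k in equation (2)."""
    zeta = cmath.exp(2j * cmath.pi / p)
    radicand = ((zeta**k - 1) * (zeta**-k - 1)) / ((zeta - 1) * (zeta**-1 - 1))
    # The radicand is a positive real number up to rounding error.
    return math.sqrt(radicand.real)


def sine_ratio(p, k, a=1):
    """sin(k*a*pi/p) / sin(a*pi/p); a = 1 gives epsilon_k itself."""
    return math.sin(k * a * math.pi / p) / math.sin(a * math.pi / p)


def absolute_norm(p, k):
    """Product of the absolute values of the conjugates of epsilon_k."""
    return math.prod(abs(sine_ratio(p, k, a)) for a in range(1, (p - 1) // 2 + 1))


for p in (5, 7, 11, 13):
    for k in range(2, (p - 1) // 2 + 1):
        print(p, k, cyclotomic_unit(p, k), sine_ratio(p, k), absolute_norm(p, k))
```

For $p = 5$ and $k = 2$ the unit is the golden ratio $(1 + \sqrt{5})/2$, the fundamental unit of $\mathbb{Q}(\sqrt{5})$.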


Further, it turns out that $h_p^+$ divides $h_p$, so that we can define the
integer $h_p^- = h_p/h_p^+$, called the relative class number. In the older literature
$h_p^-$ and $h_p^+$ are referred to as the first and second factors of the class number
respectively. Kummer also gives a beautiful and important formula for $h_p^-$
which we shall now proceed to explain.

Let $\chi$ be a Dirichlet character modulo $p$. We say that $\chi$ is odd if
$\chi(-1) = -1$ and even if $\chi(-1) = 1$. Define $B_{1,\chi} = p^{-1}\sum_{a=1}^{p-1}\chi(a)a$. It is
easy to see that $B_{1,\chi} = 0$ if $\chi$ is even and not trivial. On the other hand:


Theorem 2.2.

$$h_p^- = 2p \prod_{\chi\ \mathrm{odd}} \left(-\tfrac{1}{2}B_{1,\chi}\right).$$
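
The formula $h_p^- = 2p\prod_{\chi\ \mathrm{odd}}(-\tfrac12 B_{1,\chi})$ of Theorem 2.2 lends itself to a direct numerical experiment. In the Python sketch below (our own illustration, with hypothetical function names) the characters modulo $p$ are realized through a primitive root $g$: the $m$-th character sends $g^t$ to $e^{2\pi i m t/(p-1)}$ and is odd exactly when $m$ is odd. The product is evaluated in floating point and rounded, which is adequate only for small $p$.

```python
import cmath


def primitive_root(p):
    """Smallest primitive root modulo the prime p (brute force)."""
    for g in range(2, p):
        if len({pow(g, t, p) for t in range(p - 1)}) == p - 1:
            return g
    raise ValueError("no primitive root found; is p an odd prime?")


def relative_class_number(p):
    """Evaluate 2p * prod over odd chi of (-1/2) B_{1,chi}, rounded to an integer."""
    g = primitive_root(p)
    # Discrete logarithm table: index[a] = t with g^t = a (mod p).
    index = {pow(g, t, p): t for t in range(p - 1)}
    product = complex(2 * p)
    for m in range(1, p - 1, 2):  # odd m corresponds to an odd character
        b1 = sum(
            a * cmath.exp(2j * cmath.pi * m * index[a] / (p - 1))
            for a in range(1, p)
        ) / p
        product *= -0.5 * b1
    return round(product.real)


for p in (3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
    print(p, relative_class_number(p))
```

The output is 1 for every prime below 23, and the value at 37 is divisible by 37, in agreement with 37 being irregular.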


The numbers $B_{1,\chi}$ are sometimes called generalized Bernoulli numbers.
We will relate them to ordinary Bernoulli numbers a little later. First, we
mention another of Kummer's important theorems about the class number.


Theorem 2.3. If $p$ divides $h_p^+$, then $p$ divides $h_p^-$.


Definition. A prime number $p$ is called regular if $p$ does not divide $h_p$,
otherwise it is called irregular.

As was mentioned in the introduction, Kummer’s greatest contribution 
to FLT was to show that it is true for regular primes. We will sketch a proof 
of this in Section 3. For the remainder of this section we discuss Kummer’s 
criterion for regularity which allows one to actually compute whether a
given prime is regular or not. We begin by recalling the definition of the
Bernoulli numbers and some of their properties (see [BS], [IR], [Ri], or [W]).
The simplest way to define the Bernoulli numbers $B_n$ is by way of the
power series expansion

$$\frac{t}{e^t - 1} = \sum_{n=0}^{\infty} B_n \frac{t^n}{n!}.$$

From this it is easy to derive the formula

$$(m+1)B_m = -\sum_{k=0}^{m-1} \binom{m+1}{k} B_k.$$

The Bernoulli numbers may now be computed recursively. One finds
$B_1 = -\frac{1}{2}$, $B_2 = \frac{1}{6}$, $B_3 = 0$, $B_4 = -\frac{1}{30}$, $B_5 = 0$, $B_6 = \frac{1}{42}$, etc. For $n > 1$ and
odd one has $B_n = 0$. The even numbered Bernoulli numbers grow quite
rapidly. In fact, $|B_{2m}| > 2(m/\pi e)^{2m}$.
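
The recursion $(m+1)B_m = -\sum_{k=0}^{m-1}\binom{m+1}{k}B_k$, started from $B_0 = 1$, translates directly into exact rational arithmetic. The following Python sketch (the function name is our own) reproduces the values listed above.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(limit):
    """Return [B_0, B_1, ..., B_limit] as exact fractions, with B_1 = -1/2."""
    B = [Fraction(1)]
    for m in range(1, limit + 1):
        # (m+1) B_m = - sum_{k<m} binom(m+1, k) B_k
        s = sum(comb(m + 1, k) * B[k] for k in range(m))
        B.append(-s / (m + 1))
    return B


B = bernoulli_numbers(12)
for n, value in enumerate(B):
    print(n, value)
```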

The Bernoulli numbers are rational numbers whose denominators are
known thanks to the theorem of von Staudt and Clausen. This asserts
that for even $n$ the denominator of $B_n$ is the product of those primes $p$ such that
$p - 1$ divides $n$. In particular, if $p > 3$ the numbers $B_2, B_4, B_6, \ldots, B_{p-3}$
are $p$-integral.
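
The theorem of von Staudt and Clausen can be confirmed for small indices by comparing the denominator of $B_n$, computed exactly, with the product of the primes $p$ for which $p - 1$ divides $n$. The Python sketch below is an illustration under our own naming; the range of indices checked is an arbitrary choice.

```python
from fractions import Fraction
from math import comb, prod


def bernoulli_numbers(limit):
    """Return [B_0, ..., B_limit] as exact fractions via the standard recursion."""
    B = [Fraction(1)]
    for m in range(1, limit + 1):
        s = sum(comb(m + 1, k) * B[k] for k in range(m))
        B.append(-s / (m + 1))
    return B


def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))


def predicted_denominator(n):
    """Product of the primes p with (p - 1) dividing n."""
    return prod(p for p in range(2, n + 2) if is_prime(p) and n % (p - 1) == 0)


B = bernoulli_numbers(40)
for n in range(2, 41, 2):
    print(n, B[n].denominator, predicted_denominator(n))
```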

Let $\mathbb{Z}_p$ denote the $p$-adic integers. It is well known that the unit
group $\mathbb{Z}_p^*$ contains the $(p-1)$st roots of unity, and that there is a character
$\omega : (\mathbb{Z}/p\mathbb{Z})^* \to \mathbb{Z}_p^*$ with the property that $\omega(a) \equiv a \pmod p$ for all rational
integers $a$ prime to $p$. $\omega$ is an odd character of order $p - 1$. Using Theorem
2.2, one easily derives the following $p$-adic version.


Theorem 2.2 P.

$$h_p^- = 2p \prod_{\substack{i=1 \\ i\ \mathrm{odd}}}^{p-2} \left(-\tfrac{1}{2}B_{1,\omega^i}\right).$$


The equality here takes place inside $\mathbb{Z}_p$.

To get Kummer's criterion from this we need the following congruence,
which is proved in [W, Cor. 5.15] using $p$-adic $L$-functions and in [Lgl,
Theorem 2.5] using the theory of $p$-adic distributions. Because of its
importance we give another, more elementary proof, in the appendix to this
paper.

Proposition 2.3. If $n$ is odd, and $p - 1$ does not divide $n + 1$, we have

$$B_{1,\omega^n} \equiv \frac{B_{n+1}}{n+1} \pmod p.$$
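
The congruence $B_{1,\omega^n} \equiv B_{n+1}/(n+1) \pmod p$ can be tested with integer arithmetic alone. The sketch below rests on one observation that we supply ourselves: since $\omega(a) \equiv a^p \pmod{p^2}$, the sum $\sum_a a\,\omega^n(a)$ is congruent to $\sum_a a^{1+pn}$ modulo $p^2$, and dividing by $p$ then gives $B_{1,\omega^n}$ modulo $p$. The function names and the list of primes tested are illustrative choices.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(limit):
    """Return [B_0, ..., B_limit] as exact fractions via the standard recursion."""
    B = [Fraction(1)]
    for m in range(1, limit + 1):
        s = sum(comb(m + 1, k) * B[k] for k in range(m))
        B.append(-s / (m + 1))
    return B


def b1_omega_power_mod_p(p, n):
    """B_{1, omega^n} modulo p, using omega(a) = a^p modulo p^2."""
    total = sum(a * pow(a, p * n, p * p) for a in range(1, p)) % (p * p)
    # The sum is divisible by p whenever p - 1 does not divide n + 1.
    assert total % p == 0
    return (total // p) % p


def bernoulli_quotient_mod_p(p, n, B):
    """B_{n+1} / (n+1) reduced modulo p (its denominator must be prime to p)."""
    value = B[n + 1] / (n + 1)
    return value.numerator * pow(value.denominator, -1, p) % p


B = bernoulli_numbers(40)
for p in (5, 7, 11, 13, 37):
    for n in range(1, p - 3, 2):  # odd n with n + 1 <= p - 3
        print(p, n, b1_omega_power_mod_p(p, n), bernoulli_quotient_mod_p(p, n, B))
```

For $p = 37$ and $n = 31$ both sides vanish modulo 37, reflecting the fact that 37 divides the numerator of $B_{32}$.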


We are now in a position to demonstrate the following wonderful result 
of Kummer. 
Theorem 2.4. A prime number $p$ is regular if and only if it does not
divide the numerator of any of the Bernoulli numbers $B_2, B_4, \ldots, B_{p-3}$.


PROOF. By Theorem 2.3 it follows that $p$ divides $h_p$ if and only if $p$ divides
$h_p^-$. We will use the expression for $h_p^-$ given by Theorem 2.2 P.
First consider the case $i = p - 2$. We have,

$$pB_{1,\omega^{p-2}} = \sum_{a=1}^{p-1} \omega^{p-2}(a)a \equiv \sum_{a=1}^{p-1} 1 = p - 1 \equiv -1 \pmod p.$$

Thus,

$$h_p^- \equiv \prod_{\substack{i=1 \\ i\ \mathrm{odd}}}^{p-4} \left(-\frac{1}{2}B_{1,\omega^i}\right) \equiv \prod_{\substack{i=1 \\ i\ \mathrm{odd}}}^{p-4} \left(-\frac{1}{2}\,\frac{B_{i+1}}{i+1}\right) \pmod p.$$

The result follows directly from this congruence.


By using this theorem it is possible to check a given prime for regu- 
larity. Kummer showed that the only irregular primes less than 100 are 
37,59, and 67. He later checked all the primes up to 164 and found that 
101, 103, 131, 149, and 157 are the only additional irregular primes. At one 
point he thought that he had a proof that there are infinitely many reg- 
ular primes. In 1915, K.L. Jensen proved that there are infinitely many 
irregular primes [Ri; Lecture VI, Section 4]. However, to this day it is not 
known if there are infinitely many regular primes. It is possible to give 
a probabilistic argument to show that over 60% of the primes should be 
regular. 

Let's assume that the probability that an even indexed Bernoulli number
$B_{2m}$ be divisible by $p$ is $\frac{1}{p}$. If this is so, the probability that none of
the numbers $B_2, B_4, \ldots, B_{p-3}$ be divisible by $p$ is

$$\left(1 - \frac{1}{p}\right)^{(p-3)/2} \approx e^{-1/2} \approx 0.6065.$$

This estimate for the percentage of regular primes agrees very well
with the experimental evidence [BCEM]. It would be nice to have a rigorous
proof.
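
Kummer's criterion (Theorem 2.4) and the heuristic density $e^{-1/2}$ can both be tried out on a small scale. The Python sketch below uses our own function names and the bound 164 up to which Kummer is said above to have carried his computations. It finds the irregular primes by testing the numerators of $B_2, B_4, \ldots, B_{p-3}$, and compares $(1 - 1/p)^{(p-3)/2}$ for one large $p$ with $e^{-1/2}$.

```python
from fractions import Fraction
from math import comb, exp


def bernoulli_numbers(limit):
    """Return [B_0, ..., B_limit] as exact fractions via the standard recursion."""
    B = [Fraction(1)]
    for m in range(1, limit + 1):
        s = sum(comb(m + 1, k) * B[k] for k in range(m))
        B.append(-s / (m + 1))
    return B


def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))


def is_regular(p, B):
    """Kummer's criterion: p divides no numerator among B_2, B_4, ..., B_{p-3}."""
    return all(B[n].numerator % p != 0 for n in range(2, p - 2, 2))


B = bernoulli_numbers(162)
irregular = [p for p in range(3, 165) if is_prime(p) and not is_regular(p, B)]
print(irregular)

p = 1000003  # an arbitrary large odd number, used only to illustrate the limit
heuristic = (1 - 1 / p) ** ((p - 3) / 2)
print(heuristic, exp(-0.5))
```

The list printed agrees with the irregular primes attributed to Kummer above. With only 37 odd primes below 165, the observed proportion of regular primes in this range is of course too small a sample to test the heuristic.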

We end this section with two more of Kummer's results. These two
concern units. The first is fairly easy, but very useful. The second is quite
deep, and is crucial to Kummer's proof of FLT for regular primes.

Recall some simple facts about $\mathbb{Q}(\zeta_p)$. The prime $p$ is totally ramified
in this field and the prime lying above $(p)$ in $\mathbb{Z}[\zeta_p]$ is $(\lambda)$, where $\lambda = \zeta_p - 1$.
Note that $\zeta_p \equiv 1 \pmod{\lambda}$ and $\mathbb{Z}[\zeta_p]/(\lambda) \cong \mathbb{Z}/p\mathbb{Z}$.
Proposition 2.5. Let $u$ be a unit in $\mathbb{Z}[\zeta_p]$. Then $u$ can be written in a
unique way as $\pm\zeta_p^{i}\varepsilon$, where $i$ is determined modulo $p$, and $\varepsilon$ is real and
positive.


PROOF. Let bar denote, as usual, complex conjugation. Then, $u/\bar{u}$ is a unit
such that it and all its conjugates have absolute value 1. By a well known
result, $u/\bar{u}$ is a root of unity. By ramification theoretic considerations the
only roots of unity in $\mathbb{Z}[\zeta_p]$ are $\pm\zeta_p^i$ for $0 \le i \le p - 1$. If $u/\bar{u} = \zeta_p^i$, find an
integer $j$ such that $2j \equiv i \pmod p$. Then one easily checks $\zeta_p^{-j}u$ is equal to
its own conjugate, i.e., is real. The result follows in this case by adjusting
the sign.

Suppose $u/\bar{u} = -\zeta_p^i$. We will show this leads to a contradiction. Choosing
$j$ as above, and setting $w = \zeta_p^{-j}u$, we find that $w$ is a unit such that
$\bar{w} = -w$. Thus $w^2 = -w\bar{w} = -v$. It follows that $-v \in \mathbb{Z}[\zeta_p]^+$ is a negative
unit and so $w$ generates the extension $\mathbb{Q}(\zeta_p)/\mathbb{Q}(\zeta_p)^+$. It would follow that
this extension is unramified at $p$ (recall $p \ne 2$). However, it is ramified at
$p$, so we have reached a contradiction.


Corollary. Let $E_p^+$ be the real positive units in $E_p$ and $C_p^+$ be the
subgroup of $C_p$ generated by the cyclotomic units (see equation (2) above).
Then, $E_p^+/C_p^+ \cong E_p/C_p$ and both groups have order $h_p^+$.


PROOF. This follows directly from Theorem 2.1 and Proposition 2.5. 


We remark that there is another way to finish the proof of the
proposition. Suppose $w$ is a unit such that $\bar{w} = -w$. By the remarks preceding
the proposition, there is a rational integer $M$ such that $w \equiv M \pmod{\lambda}$.
It follows that $M \equiv -M \pmod{\lambda}$ and so $2M \equiv 0 \pmod{\lambda}$. Thus, $\lambda$ divides
$M$ and so also $w$. However, $w$ is a unit so this is a contradiction.

The final result we need is known as Kummer's lemma. It is simple to
state. If $p$ is a regular prime, any unit in $\mathbb{Q}(\zeta_p)$ which is congruent to a
rational integer modulo $p$ is a $p$th power. The usual proof is by means of a close analysis of the unit
group and especially of the local unit group in the completion of $\mathbb{Q}(\zeta_p)$ at
the prime $(\lambda)$. This involves the $p$-adic logarithm, the local expansion of the
$p$-adic logarithm of the cyclotomic units, etc. The analysis is nicely carried
out in Section 6 of Chapter 5 in [BS]. We give an alternative approach,
suggested by Hilbert in his Zahlbericht [H], which depends on the study of
ramification in Kummer extensions. This approach brings out more clearly
the underlying role played by class field theory.


Lemma 2.6. Let $\beta \in \mathbb{Z}[\zeta_p]$. Suppose $\lambda$ does not divide $\beta$ and that $x^p \equiv
\beta \pmod{\lambda^p}$ is solvable. Let $K$ be the field obtained by adjoining $A = \sqrt[p]{\beta}$
to $\mathbb{Q}(\zeta_p)$. Then the extension $K/\mathbb{Q}(\zeta_p)$ is unramified at $p$.


PROOF. Suppose $a^p \equiv \beta \pmod{\lambda^p}$ and set $\tau = (A - a)/\lambda$. Then $\tau$ is a
root of the monic polynomial

$$f(x) = \frac{(\lambda x + a)^p - \beta}{\lambda^p}.$$

Using the fact that $p = u\lambda^{p-1}$ where $u$ is a unit, we see that all the
coefficients of $f(x)$ are algebraic integers. Thus, $\tau$ is an algebraic integer. A
short computation shows that all the coefficients of $f'(x)$ are in $(\lambda)$ except
the constant term $ua^{p-1}$, which is prime to $(\lambda)$. Thus, $f'(\tau)$ is prime to $(\lambda)$
and it follows that the relative discriminant is prime to $(\lambda)$. The proof is
complete.


Theorem 2.7. Let $p$ be a regular prime and $\varepsilon$ a unit in $\mathbb{Q}(\zeta_p)$ which is
congruent to a $p$th power modulo $\lambda^p$. Then, $\varepsilon$ is the $p$th power of a unit.


PROOF. Consider the extension $L = \mathbb{Q}(\zeta_p, \sqrt[p]{\varepsilon})$ of $\mathbb{Q}(\zeta_p)$. This extension
is cyclic of degree 1 or $p$. Suppose the degree is $p$. By Lemma 2.6, the
extension is unramified at $p$. Since $\varepsilon$ is a unit it is easy to see it is unramified
at every other prime as well. Thus, it is an unramified, abelian extension of
degree $p$ and it follows by class field theory that $p \mid h_p$, which contradicts the
assumption that $p$ is regular. Thus, $\sqrt[p]{\varepsilon} \in \mathbb{Q}(\zeta_p)$ and the theorem follows.


The phrase “it follows by class field theory” can be avoided. The result 
needed follows from Theorem 94 of Hilbert’s Zahlbericht [H; page 155]. We 
sketch a short cohomological proof of this in the Appendix. 

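The usual form of Kummer's Lemma can be deduced from Theorem
2.7 as follows. If $\varepsilon \equiv a \pmod p$ with $a \in \mathbb{Z}$ we have $\varepsilon = a + p\alpha$ with
$\alpha \in \mathbb{Z}[\zeta_p]$. Now, $\alpha \equiv b \pmod{\lambda}$ with $b \in \mathbb{Z}$. Thus, $\varepsilon \equiv a + bp \pmod{\lambda^p}$. Let
$\sigma$ be an element of $\mathrm{Gal}(\mathbb{Q}(\zeta_p)/\mathbb{Q})$. Then, $\varepsilon^\sigma/\varepsilon \equiv 1 \pmod{\lambda^p}$ and is thus
a $p$th power by Theorem 2.7: $\varepsilon^\sigma = \varepsilon\,\eta(\sigma)^p$. Taking the product over all $\sigma$
and remembering that the norm of a unit is $\pm 1$, we find $\pm 1 = \varepsilon^{p-1}\eta^p$ for
a suitable $\eta$ and finally, $\varepsilon = (\pm\varepsilon\eta)^p$.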


Section 3. Fermat’s last theorem for regular primes and certain 
other cases. 

Having assembled a number of the powerful tools forged by Kummer, we 
will now give part of his proof that FLT is true for regular primes. Here is 
the statement of the full theorem. 


Theorem 3.1. Let $p$ be a regular prime. Then, the equation $x^p + y^p = z^p$
has no solution with $x, y, z \in \mathbb{Z}[\zeta_p]$ and $xyz \ne 0$.


The proof is usually broken up into two parts. The first case is when $\lambda$
does not divide $xyz$ and the second case is when $\lambda$ does divide $xyz$ (recall,
$\lambda = \zeta_p - 1$). The first case is easier. If one confines one's attention to $\mathbb{Z}$
rather than $\mathbb{Z}[\zeta_p]$, proofs of the first case can be found in [BS, IR, W] and
many other places. For a proof of the first case in $\mathbb{Z}[\zeta_p]$ see [La] or [H]. We
will concentrate on the second case.

It is interesting that Kummer made a simple error at the beginning of
his proof of the second case. He asserts that it is no loss of generality to
assume that $x$, $y$, $z$ are pairwise relatively prime. This is certainly true over
$\mathbb{Z}$, but is false over $\mathbb{Z}[\zeta_p]$ because there may be a common divisor which
is not principal. Once this is realized it is not hard to alter Kummer's
proof so that it holds in full generality. Hilbert does so in Section 172
of [H]. See [La] for another presentation. We give Kummer's proof of the
more restricted result because it is relatively short and the main ideas show
through more clearly.


Theorem 3.1'. Let $p$ be a regular prime. Then, the equation

$$x^p + y^p = z^p$$

has no solution with $x, y, z \in \mathbb{Z}[\zeta_p]$, $\lambda \mid xyz$, and $x, y, z$ pairwise relatively
prime.


PROOF. Assume $x, y, z$ is such a solution. It is no loss of generality to
assume that $\lambda \mid z$. It follows that $\lambda$ does not divide $x$ or $y$. Write $z = \lambda^m z_0$
with $(\lambda, z_0) = 1$. Then, $x, y, z_0$ is a solution to $X^p + Y^p = \lambda^{mp}Z^p$ with
$(xyz_0, \lambda) = 1$, $x, y, z_0$ pairwise relatively prime, and $m \ge 1$. Let $u$ be a unit
in $\mathbb{Z}[\zeta_p]$. We will show that there are no solutions $x, y, z$ to

$$X^p + Y^p = u\lambda^{mp}Z^p \qquad (*)$$

with $x, y, z$ pairwise relatively prime, $(xyz, \lambda) = 1$, and $m \ge 1$. This will
prove the theorem.

The strategy is this. Assume such a solution exists. One shows that, 
in fact, m must be greater than 1. Then one finds a solution of the same 
type to a similar equation but with $m$ replaced by $m - 1$. This yields a
contradiction via “infinite descent.” 

We need a Lemma. 


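Lemma 3.2. Let $v \in \mathbb{Z}[\zeta_p]$ with $(v, \lambda) = 1$. Then, there is a rational
integer $k$ such that $\zeta_p^k v \equiv a \pmod{\lambda^2}$ with $a \in \mathbb{Z}$.


PROOF. Since the residue class field modulo $\lambda$ has $p$ elements, we may
write $v \equiv m + n\lambda \pmod{\lambda^2}$ where $m, n \in \mathbb{Z}$ and $(m, p) = 1$. Now, $\zeta_p^k =
(1 + \lambda)^k \equiv 1 + k\lambda \pmod{\lambda^2}$. Thus, $\zeta_p^k v \equiv m + (n + km)\lambda \pmod{\lambda^2}$. Choose
$k$ to be a solution of $n + mx \equiv 0 \pmod p$ and the Lemma follows.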


Now, assume $x, y, z$ is a solution of equation (*) above with $(xyz, \lambda) =
1$ and $x, y, z$ pairwise relatively prime. By Lemma 3.2 we can assume that
$x, y \equiv a, b \pmod{\lambda^2}$ where $a, b \in \mathbb{Z}$. We have,

$$\prod_{i=0}^{p-1}(x + \zeta_p^i y) = u\lambda^{mp}z^p. \qquad (**)$$

It follows that $\lambda$ must divide at least one term on the left hand side of this
equation, and consequently it must divide all the terms. Since $x + y \equiv
a + b \pmod{\lambda^2}$, we must have $p \mid a + b$ and so $\lambda^2 \mid x + y$. From equation
(**) it now follows that $\lambda^{p+1}$ divides the left hand side, and so $m > 1$,
which is our first goal.

We have $x + \zeta_p^i y = x + y + (\zeta_p^i - 1)y$ and so for $i > 0$, $x + \zeta_p^i y$ is exactly
divisible by $\lambda$. Thus, passing to ideals, we find $(x + y) = (\lambda)^{p(m-1)+1}C_0$ and
$(x + \zeta_p^i y) = (\lambda)C_i$ for $i = 1, 2, \ldots, p - 1$, where the $C_i$ for $i = 0, 1, \ldots, p - 1$
are pairwise relatively prime principal ideals each of which is prime to $(\lambda)$.
From equation (**) we deduce $C_0C_1 \cdots C_{p-1} = (z)^p$. It follows that each
$C_i$ is a $p$th power, i.e., $C_i = D_i^p$. Since $D_i^p$ is a principal ideal and $p$ is a
regular prime, we deduce that $D_i$ is a principal ideal, i.e., $D_i = (w_i)$ for
$w_i \in \mathbb{Z}[\zeta_p]$. Returning to the level of elements, we see that there are units
$u_0, u_1, u_2$ such that

$$x + y = u_0\lambda^{p(m-1)+1}w_0^p, \qquad x + \zeta_p y = u_1\lambda w_1^p, \qquad x + \zeta_p^2 y = u_2\lambda w_2^p.$$

Eliminating $x$ and $y$ from these three equations and dividing the result by
$(1 + \zeta_p)u_1\lambda$, we find units $e_2$ and $e$ such that

$$w_1^p + e_2w_2^p = e\lambda^{p(m-1)}w_0^p.$$

Taking congruences modulo $\lambda^p$, we see that $e_2$ is congruent to a $p$th
power modulo $\lambda^p$. By Theorem 2.7, $e_2 = f^p$ for some $f \in \mathbb{Z}[\zeta_p]$. Setting
$x' = w_1$, $y' = fw_2$, and $z' = w_0$, we see that $x', y', z'$ is a solution to
$X^p + Y^p = e\lambda^{(m-1)p}Z^p$ for which $(x'y'z', \lambda) = 1$ and $x', y', z'$ are pairwise
relatively prime. We have reached our second goal and so completed the
proof of Theorem 3.1'.


Kummer went well beyond the case of regular primes in his attempt
to prove FLT in general. He produced some explicit cyclotomic units $E_i$
for $i = 2, 4, \ldots, p - 3$ and stated that FLT is true if the following conditions
hold: $h_p^-$ is divisible by $p$ but not $p^2$, and for $i = 2, 4, \ldots, p - 3$, $B_{pi}$ is not
divisible by $p^3$ and $E_i$ is not a $p$th power. He then verified on the basis of
this result and Theorem 3.1 that FLT holds for all primes less than 100.

Kummer’s work was reconsidered by H.S. Vandiver in the 1920’s. He 
found some problems with the proof of the above mentioned result which 
he was able to fix. He went on to improve upon Kummer’s work in several 
ways. For example, he proved the following theorem [V], [W; Theorem 9.4]. 
Theorem 3.3. Suppose $B_{pi}$ is not divisible by $p^3$ for $i = 2, 4, \ldots, p - 3$
and that $h_p^+$ is not divisible by $p$. Then, the second case of FLT is true
for $p$.


In Chapter 9 of [W] there are further results of Vandiver which give 
rational criteria for proving the second case of FLT. See, in particular, The- 
orem 9.5. The first case of FLT can also be tested by even simpler rational 
criteria (see [LS]). Thus, it became possible to test FLT computationally. 
Vandiver and his students, using desk calculators, extended Kummer’s ver- 
ification to all primes less than 620. In 1954, using a computer, Vandiver, 
D.H. Lehmer, and E. Lehmer verified FLT for all primes up to 2,000. In 
1955, J.L. Selfridge, C.A. Nicol, and H.S. Vandiver verified FLT up to 
4,001. In 1976, S.Wagstaff showed FLT is true for all p < 125,000. In 1993 
it was shown that FLT is true for all primes up to 4 million [BCEM]. This 
is the largest bound achieved before FLT was proven to hold for all primes
$p$ and so all $n > 2$.

Vandiver (1882-1973) made many contributions to FLT and other 
parts of number theory. For references to some of his work on FLT see 
his interesting expository article (and the short follow up article) [V]. Ac- 
cording to an interesting obituary notice written by D.H. Lehmer [Lmr], 
Vandiver never graduated from high school. He spent most of his profes- 
sional life at the University of Texas. He is the only American mentioned 
in E. Landau’s monumental three volume treatise on number theory,
Vorlesungen über Zahlentheorie.

Vandiver conjectured that $h_p^+$ is not divisible by $p$ for all primes $p$.
This conjecture is referred to simply as Vandiver’s conjecture. Serge Lang 
has pointed out that Kummer made the conjecture many years earlier in 
a letter to Kronecker where he refers to it as a “noch zu beweisenden 
Satz” (Kummer’s Collected Works, vol. 1, page 85). In any case, it has 
held up well. In [BCEM] it is verified for all primes less than 4 million. 
Larry Washington has produced a probabilistic argument [W, page 159] 
which shows that the number of exceptions to Vandiver’s conjecture up to 
a given bound $x$ should be approximately $\frac{1}{2}\log\log x$. For x = 4,000,000
this is approximately 1.361, so the fact that no counter-example has shown 
up is perhaps not surprising. On the other hand Washington’s reasoning 
rests on certain randomness assumptions which may or may not hold. In 
any case, the conjecture is true very often! This is good because, as we 
shall see in the next section, Vandiver’s conjecture has very interesting 
implications. 
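
As a quick numerical check of the heuristic, the following Python sketch evaluates $\frac{1}{2}\log\log x$ at x = 4,000,000. The function name `expected_exceptions` is ours and is not taken from [W]; the code only evaluates the formula and says nothing about the randomness assumptions behind it.

```python
import math


def expected_exceptions(x: float) -> float:
    """Heuristic count (1/2) * log(log(x)) of exceptions to Vandiver's conjecture up to x."""
    return 0.5 * math.log(math.log(x))


if __name__ == "__main__":
    # Rounded to three decimals this reproduces the value quoted in the text.
    print(round(expected_exceptions(4_000_000), 3))
```

The growth is extremely slow, so even a much larger search range would change the expected count very little.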




Section 4. The structure of the p-class group. 

A prime is irregular if $A_p$, the $p$-part of the class group of $\mathbb{Q}(\zeta_p)$, is
nontrivial. Much work has been devoted to understanding the structure of $A_p$,
beginning with Kummer. Important contributions have been made by
people like Hilbert, Herbrand, Leopoldt, Iwasawa, Ribet, Mazur, and Wiles,
among others. In this section we will review some of this work. Among
other things we will show that if one accepts the Vandiver conjecture as
true, then it is possible to give a completely satisfying description of the
structure of $A_p$.

We begin with a few preliminary remarks. It is convenient to write
the group operation in $A_p$ additively. Also, since $p$ is fixed in this
discussion, we will write $A$ instead of $A_p$. Since $A$ is a torsion $p$-group, we
may consider it as a $\mathbb{Z}_p$-module. It is also a module for the Galois group
$G_p = \mathrm{Gal}(\mathbb{Q}(\zeta_p)/\mathbb{Q})$. Thus, $A$ is a module over the group ring $\mathbb{Z}_p[G_p]$.
As is well known, $(\mathbb{Z}/p\mathbb{Z})^*$ is isomorphic to $G_p$, where the automorphism
corresponding to $a$, $\sigma_a$, takes $\zeta_p$ to $\zeta_p^a$. Let $\omega$ be the $p$-adic valued character
introduced in Section 2, and for each $i$ such that $0 \le i < p-1$ define

$$\epsilon_i = \frac{1}{p-1} \sum_a \omega(a)^i \sigma_a^{-1} \in \mathbb{Z}_p[G_p].$$

Here as elsewhere in this section, the summation goes from $a = 1$ to
$a = p-1$. It is easy to check that these elements constitute a complete set
of mutually orthogonal idempotents in the group ring. Define $A_i = \epsilon_i A$.
Then,

$$A = \bigoplus_{i=0}^{p-2} A_i, \quad \text{and if } x \in A_i, \text{ then } \sigma_a x = \omega(a)^i x.$$

Because of this decomposition, to understand the structure of $A$, it suffices
to understand the structure of each $A_i$.
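
The idempotent relations can be checked by machine in a simplified setting. The sketch below is only an illustration under a stated assumption: it works with the reduction modulo $p$ of the group ring, where $\omega(a) \equiv a \pmod{p}$, rather than with $p$-adic coefficients. The names `idempotent`, `multiply`, `is_complete_orthogonal_system` and `acts_by_character` are ours.

```python
def idempotent(i: int, p: int) -> list:
    """Reduction mod p of eps_i; entry c holds the coefficient of sigma_c (entry 0 is unused)."""
    scale = pow(p - 1, -1, p)          # 1/(p-1) mod p
    eps = [0] * p
    for a in range(1, p):
        # omega(a)^i is congruent to a^i mod p, attached to sigma_a^{-1} = sigma_{a^{-1}}
        eps[pow(a, -1, p)] = scale * pow(a, i, p) % p
    return eps


def multiply(x: list, y: list, p: int) -> list:
    """Product in F_p[G_p], using sigma_a * sigma_b = sigma_{ab}."""
    z = [0] * p
    for a in range(1, p):
        for b in range(1, p):
            z[a * b % p] = (z[a * b % p] + x[a] * y[b]) % p
    return z


def is_complete_orthogonal_system(p: int) -> bool:
    """Check eps_i * eps_j = delta_ij * eps_i and that the eps_i sum to the identity."""
    eps = [idempotent(i, p) for i in range(p - 1)]
    zero = [0] * p
    one = [0] * p
    one[1] = 1                          # sigma_1 is the identity element
    for i in range(p - 1):
        for j in range(p - 1):
            expected = eps[i] if i == j else zero
            if multiply(eps[i], eps[j], p) != expected:
                return False
    total = [sum(e[c] for e in eps) % p for c in range(p)]
    return total == one


def acts_by_character(p: int) -> bool:
    """Check sigma_a * eps_i = a^i * eps_i mod p, the mod p shadow of sigma_a x = omega(a)^i x."""
    for i in range(p - 1):
        eps = idempotent(i, p)
        for a in range(1, p):
            sigma_a = [0] * p
            sigma_a[a] = 1
            scaled = [pow(a, i, p) * c % p for c in eps]
            if multiply(sigma_a, eps, p) != scaled:
                return False
    return True


if __name__ == "__main__":
    print(is_complete_orthogonal_system(7), acts_by_character(7))
```

Working modulo $p$ keeps the arithmetic exact and short; a check over $\mathbb{Z}_p$ to higher precision would replace `pow(a, i, p)` by a truncated Teichmüller value.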

The automorphism $\sigma_{-1}$ is simply complex conjugation. Since $\omega(-1) =
-1$, it follows that complex conjugation fixes $A_i$ if $i$ is even, and acts as
multiplication by $-1$ if $i$ is odd. It is not hard to show that the part of $A$
which is fixed by complex conjugation is isomorphic to the $p$-part of the
class group of $\mathbb{Q}(\zeta_p)^+$. Thus, if Vandiver’s conjecture is true, $A_i = (0)$ for
$i$ even.

A final preliminary comment is that $\epsilon_0$ is a constant times the norm
map, and the norm map annihilates the class group. It follows that $\epsilon_0$
simultaneously annihilates and fixes $A_0$. The conclusion is that $A_0 = (0)$.

Do any other elements of the group ring $\mathbb{Z}[G_p]$ annihilate $A$? This
question is the key to the deeper part of the theory. The answer is that
yes, there are other elements besides the norm map which annihilate the
class group. The honor of being the first to see this goes (once again) to
Kummer. Let $l$ be a rational prime such that $l \equiv 1 \pmod{p}$, $\mathcal{L}$ a prime
lying above it in $\mathbb{Q}(\zeta_p)$, and let $\chi$ be an appropriately chosen character of
$(\mathbb{Z}/l\mathbb{Z})^*$ of order $p$. Form the Gauss sum

$$G(\chi) = \sum_{x=0}^{l-1} \chi(x)\zeta_l^x.$$

Then, it is easy to show $G(\chi)^p \in \mathbb{Q}(\zeta_p)$. What Kummer did (his
notation was different) was to determine the prime decomposition of this
element. He showed,

$$(G(\chi)^p) = \mathcal{L}^{\sum a\sigma_a^{-1}}.$$

Kummer also showed that in every ideal class there are ideals all of whose
prime factors have absolute degree 1. It follows from these considerations
that $\sum a\sigma_a^{-1}$ annihilates the class group of $\mathbb{Q}(\zeta_p)$. Somewhat later, 1890
to be precise, L. Stickelberger generalized all these considerations to the
fields $\mathbb{Q}(\zeta_m)$ with $m$ arbitrary. Complete proofs of Stickelberger’s theorem
can be found in [IR] and [W]. We will deal only with the case $m = p$. Let
us state the result we will need.


Theorem 4.1. Define $\theta = \frac{1}{p}\sum_{a=1}^{p-1} a\sigma_a^{-1}$. Then, $p\theta$ annihilates the class
group. Further, suppose $(b,p) = 1$. Then, $(\sigma_b - b)\theta \in \mathbb{Z}_p[G_p]$ and $(\sigma_b - b)\theta$
annihilates the class group.

Corollary. $A_0 = (0)$ and $A_1 = (0)$. For $i \ge 2$, $B_{1,\omega^{-i}}$ annihilates $A_i$.
(Note that this is only interesting for $i$ odd, since the generalized Bernoulli
numbers are zero for even characters).


PROOF. We have already shown that $A_0 = (0)$. Now, $p\theta$ acts on $A_1$ by
multiplication by $\sum a\,\omega(a)^{-1} = \sum a\,\omega(a)^{p-2} \equiv \sum a^{p-1} \equiv p-1 \equiv -1 \pmod{p}$.
This shows that $p\theta$ acts on $A_1$ by multiplication by a $p$-adic unit. On the
other hand, it annihilates $A_1$ by the theorem. Thus, $A_1 = (0)$.

Assume now that $i \ge 2$. Then, $(\sigma_b - b)\theta$ acts on $A_i$ by multiplication
by $(\omega(b)^i - b)B_{1,\omega^{-i}}$. Choose $b$ to be a primitive root modulo $p$. Then,

$$\omega(b)^i - b \equiv b(b^{i-1} - 1) \not\equiv 0 \pmod{p}.$$

This factor is therefore a $p$-adic unit and so $B_{1,\omega^{-i}}$ annihilates $A_i$ as
asserted. (It is easy to check using the defining property of $\omega$ that $B_{1,\omega^{-i}} \in \mathbb{Z}_p$
for $i > 1$.)

We can now prove the following important theorem of J. Herbrand 
which appeared (posthumously) in 1932 [He]. 


Theorem 4.2. Let $i$ be odd and $3 \le i \le p-2$. Set $j = p-i$. If $p$ does
not divide $B_j$, then $A_i = (0)$.


PROOF. By the above corollary, $B_{1,\omega^{-i}}$ annihilates $A_i$. By Proposition 2.3,

$$B_{1,\omega^{-i}} = B_{1,\omega^{p-1-i}} \equiv \frac{B_{p-i}}{p-i} \pmod{p}.$$

Thus, if $p$ doesn’t divide $B_{p-i} = B_j$, it follows that $B_{1,\omega^{-i}}$ is a $p$-adic unit
and so $A_i = (0)$.

Herbrand also proved the converse to this theorem, but under the 
assumption that Vandiver’s conjecture is true. In 1976, Ribet proved the 
converse is true unconditionally [R]. 


Theorem 4.3. Let $i$ be odd and $3 \le i \le p-2$. Set $j = p-i$. If $p$ divides
$B_j$, then $A_i \ne 0$.


Ribet’s methods are completely different from those used by Herbrand, 
which were classical, descending in a direct line from Kummer. Ribet 
used the rapidly developing arithmetic theory of modular forms and Galois 
representation theory, the same tools used eventually to successfully attack 
FLT itself. 

Herbrand’s theorem and its converse are somewhat qualitative in nature.
They provide a rational criterion for determining when $A_i$ is trivial
or not. In 1980, Wiles proved the following, more quantitative, result [Wi].


Theorem 4.4. Let $i$ be odd and $3 \le i \le p-2$. Assume that $A_i$ is cyclic.
Then $\#(A_i) = p^{m_i}$, where $m_i = \mathrm{ord}_p(B_{1,\omega^{-i}})$.


Once again, the methods used were the modern ones involving modular 
forms, modular curves, and representation theory. The result is very nice, 
but is conditional on the hypothesis of cyclicity. This was removed a few 
years later in 1984 when B. Mazur and A. Wiles published a proof of the 
main theorem of Iwasawa theory for abelian extensions of Q. See [MW]. As 
a corollary of their work, they deduce the following unconditional result. 


Theorem 4.5. Let $i$ be odd and $3 \le i \le p-2$. Then $\#(A_i) = p^{m_i}$, where
$m_i = \mathrm{ord}_p(B_{1,\omega^{-i}})$.


This result makes no assertion as to whether A; is cyclic or not. It 
gives its size, but not its structure. 

The Mazur-Wiles result once again uses sophisticated modern methods
involving the theory of modular forms, modular curves, and modular
Jacobians. A different approach was found by V.A. Kolyvagin, who invented
the method of Euler systems which is in principle much more elementary.
The application of these methods to Iwasawa theory is exposited in an
appendix by K. Rubin to the new edition of Lang’s book on cyclotomic fields
[Lg1].

We will end this section by showing how “classical” methods together
with the conjecture of Vandiver enable one to give a complete structure
theorem for $A$. In essence, this is due to Herbrand. However, we will
make good use of the “Spiegelungssatz” which is due to H.W. Leopoldt [Le]
(1958). Special cases were proved independently by Iwasawa. We state
such a special case next. The proof involves Kummer theory and class field
theory. See Section 10.2 of [W]. If $B$ is a finite abelian $p$-group, define
rank $B$ to be the dimension over $\mathbb{Z}/p\mathbb{Z}$ of $B/pB$. This can be shown to be
the minimal number of generators for $B$.


Theorem 4.6. Suppose $j$ is even, $1 < i, j < p-1$, and $i + j = p$. Then,
rank $A_j \le$ rank $A_i \le$ rank $A_j + 1$.


Corollary. Assuming the Vandiver conjecture, $A_j = (0)$ for $j$ even and
$A_i$ is cyclic for $i$ odd.


It should also be remarked that the first inequality of the Theorem can
be viewed as a refinement of Kummer’s result, Theorem 2.3, that if $p \mid h_p^+$
then $p \mid h_p^-$.

Define 


$$A^+ = \{x \in A \mid \sigma_{-1}x = x\} \quad \text{and} \quad A^- = \{x \in A \mid \sigma_{-1}x = -x\}.$$


Then $A = A^+ \oplus A^-$ and $A^+ = (0)$ if Vandiver’s conjecture holds.


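Theorem 4.7. Assume Vandiver’s conjecture is true. Then,

$$A = A^- = A_3 \oplus A_5 \oplus \cdots \oplus A_{p-2}.$$

For $i$ odd, $3 \le i \le p-2$, set $m_i = \mathrm{ord}_p B_{1,\omega^{-i}}$. Then,

$$A_i \cong \mathbb{Z}_p/B_{1,\omega^{-i}}\mathbb{Z}_p \cong \mathbb{Z}/p^{m_i}\mathbb{Z}.$$

Finally, $m_i > 0$ if and only if $p \mid B_{p-i}$.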


PROOF. Vandiver’s conjecture implies $A^+ = (0)$, and so $A = A^-$. By
the Corollary to Theorem 4.1, $A_1 = (0)$, so the first assertion follows.
By the same Corollary, $B_{1,\omega^{-i}}$ annihilates $A_i$, so $A_i$ is a
$\mathbb{Z}_p/B_{1,\omega^{-i}}\mathbb{Z}_p \cong \mathbb{Z}/p^{m_i}\mathbb{Z}$ module. By the Corollary to Theorem 4.6 we have that $A_i$ is
cyclic. It follows that $\#(A_i) \le p^{m_i}$. Taking the product over all $i$, we
find $\#(A^-) \le p^{\sum m_i}$. We claim that this inequality is actually an equality.
Consider the class number formula given in Theorem 2.2, and recall that
$pB_{1,\omega^{-1}}$ is a $p$-adic unit. It follows that $\mathrm{ord}_p(h_p^-) = m_3 + m_5 + \cdots + m_{p-2}$.
Since $\mathrm{ord}_p \#(A^-) = \mathrm{ord}_p(h_p^-)$, we have proven our claim. This shows that
$\#(A_i) = p^{m_i}$ for all $i$. Since $A_i$ is a cyclic $\mathbb{Z}/p^{m_i}\mathbb{Z}$ module, the conclusion
is that $A_i \cong \mathbb{Z}/p^{m_i}\mathbb{Z}$.
The final assertion follows from the congruence

$$B_{1,\omega^{-i}} \equiv B_{p-i}/(p-i) \pmod{p}.$$


The index of irregularity of a prime $p$ is defined to be the number of
Bernoulli numbers in the set $\{B_2, B_4, \dots, B_{p-3}\}$ which are divisible by $p$.
Call this number $r_p$. The above theorem shows that when Vandiver’s
conjecture is true, the rank of $A_p$ is equal to $r_p$. By [BCEM] we know that
Vandiver’s conjecture is true for all primes up to 4 million. They also are
able to verify that in this range, $A_p$ is an elementary $p$-group. Thus,

$$A_p \cong (\mathbb{Z}/p\mathbb{Z})^{r_p}$$

for all primes less than 4 million. They also show a similar result for the
$p$-part of the class group of $\mathbb{Q}(\zeta_{p^n})$.
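
For small primes the index of irregularity can be computed directly from exact Bernoulli numbers. The following Python sketch is an illustration only and is far too slow for ranges like the one in [BCEM]; the function names `bernoulli_numbers` and `irregular_indices` are ours, and the convention $B_1 = -1/2$ is assumed.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(m: int) -> list:
    """Exact Bernoulli numbers B_0, ..., B_m (with B_1 = -1/2) from the standard recurrence."""
    B = [Fraction(0)] * (m + 1)
    B[0] = Fraction(1)
    for n in range(1, m + 1):
        B[n] = -sum(comb(n + 1, k) * B[k] for k in range(n)) / (n + 1)
    return B


def irregular_indices(p: int, B: list) -> list:
    """Even j in {2, 4, ..., p-3} with p dividing the numerator of B_j."""
    return [j for j in range(2, p - 2, 2) if B[j].numerator % p == 0]


if __name__ == "__main__":
    B = bernoulli_numbers(160)
    for p in (37, 59, 67, 101, 103, 157):
        js = irregular_indices(p, B)
        # By Theorems 4.2 and 4.3 the component A_i is nontrivial exactly for i = p - j.
        print(p, "r_p =", len(js), "nontrivial A_i for i in", [p - j for j in js])
```

For j at most p-3 the denominator of $B_j$ is prime to $p$, so testing the numerator is enough.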

It is interesting to ask about the properties of the invariant $r_p$. For
example, can it be arbitrarily large? If one assumes that the probability that
a given Bernoulli number is divisible by $p$ is $1/p$, the probability (assuming
some independencies) that $r_p = k$ is given by

$$\binom{(p-3)/2}{k} \left(\frac{1}{p}\right)^k \left(1 - \frac{1}{p}\right)^{(p-3)/2 - k} \approx \frac{e^{-1/2}}{2^k k!}.$$

(The left hand side approaches the right hand side as $p \to \infty$.) The
calculations of [BCEM] show that this is in excellent agreement with the
facts for $k = 0, 1, 2, 3, 4, 5$ for the set of primes less than 4 million. The
largest value of $r_p$ found was 7, which occurs only once, for p = 3,238,481. If
one accepts the validity of the probabilistic calculation just given, it would
follow that $r_p$ can be arbitrarily large, but no proof of this is known. It is
also unknown if the exponent of the group $A_p$ can be arbitrarily large.
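
To see how close the binomial model is to its Poisson limit, one can evaluate both sides numerically. This is a minimal sketch under the independence assumption stated above; the function names `binomial_model` and `poisson_limit` are ours, and the code does not use any data from [BCEM].

```python
from math import comb, exp, factorial


def binomial_model(p: int, k: int) -> float:
    """Probability that exactly k of the (p-3)/2 Bernoulli numbers are divisible by p."""
    n = (p - 3) // 2
    return comb(n, k) * (1 / p) ** k * (1 - 1 / p) ** (n - k)


def poisson_limit(k: int) -> float:
    """Limiting value e^(-1/2) / (2^k k!) as p grows."""
    return exp(-0.5) / (2 ** k * factorial(k))


if __name__ == "__main__":
    p = 3_238_481  # the prime with r_p = 7 mentioned in the text
    for k in range(8):
        print(k, binomial_model(p, k), poisson_limit(k))
```

The limiting probabilities fall off quickly with k, which is consistent with large values of $r_p$ being rare in any finite range.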


Section 5. Suggested readings. 
In this brief final section we will point to a few sources for further reading. 

Since so much of our story has concerned Kummer, it is perhaps most 
suitable to begin with the first volume of his collected works [Ku]. The 
first volume contains his contributions to number theory. The introductory 
essay by A. Weil is very enlightening, both for its assessment of the signifi- 
cance of Kummer’s contributions and as a guide to the papers themselves. 
These volumes were first published in 1975, and there are two informative 
reviews to recommend, one by Edwards [Ed4] and the other by B. Mazur
[M]. Mazur’s review is especially valuable for connecting Kummer’s work
to modern developments, many of which are due to Mazur himself.

As for FLT itself, there is the interesting, but mostly non-mathemati- 
cal, book by E.T. Bell [B]. More substantial is the book by Edwards [Ed1]. 
In spite of its title, the emphasis is not so much FLT as the development 
of algebraic number theory in the hands of Kummer. In particular, it 
elaborates Kummer’s theory of ideal numbers in much greater detail than 
any other work available in English. Also by Edwards are the interesting 
papers [Ed2] and [Ed3], already mentioned earlier, which debunk the story 
that Kummer thought he had proven FLT but had mistakenly assumed
unique factorization in $\mathbb{Z}[\zeta_p]$. The best source for covering the spectrum
of work done on FLT and related topics up through 1978 is the book of 
Ribenboim [Ri]. It is erudite and readable at the same time. Moreover, 
the bibliography is very extensive and useful. 

The area of p-adic analysis is a modern development with origins in 
Kummer’s arithmetic work. A good introduction is the book of N. Koblitz 
[Ko]. More advanced, and more inclusive in its coverage, is the book of 
Lang [Lg1]. As has been pointed out earlier, the new edition of this book
contains a valuable appendix by Rubin which gives a proof of the main 
theorem of Iwasawa theory using the methods of Kolyvagin. 

For the purposes of finding the proofs of most of the facts related 
in this paper, the best reference is the book by L. Washington [W]. A 
second edition of this book is in production and it will contain a number of 
interesting new results, e.g., a proof of the converse to Herbrand’s theorem
which uses Kolyvagin’s Euler systems and avoids the more sophisticated 
methods used by Ribet. Of course, it should not be forgotten that these 
more sophisticated methods are what underlie the magnificent proof found 
by Wiles of FLT, a proof that eluded mathematicians for hundreds of years! 


Appendix A. Kummer congruence and Hilbert’s theorem 94. 
We will sketch the proof of two important results mentioned in the text. 
The first is a congruence which, in essence, is due to Kummer. The second 
is Hilbert’s theorem 94, which is not as famous as his theorem 90, but 
nevertheless of great importance in the history of class field theory. Our 
proof will use some properties of cohomology of groups. 


Theorem A1. Let $p$ be an odd prime, $n > 0$ an odd integer, and assume
that $p-1$ does not divide $n+1$. Then,

$$B_{1,\omega^n} \equiv \frac{B_{n+1}}{n+1} \pmod{p}.$$

PROOF. It is well known and easily proved that $\omega(a) = \lim_{k\to\infty} a^{p^k}$, where
$a$ is any $p$-adic integer and the limit is taken $p$-adically. It follows that for
$k$ sufficiently large,

$$pB_{1,\omega^n} \equiv \sum_{a=1}^{p-1} a^{1+np^k} \equiv pB_{1+np^k} \pmod{p^2}. \qquad (*)$$

The last congruence follows from the Corollary to Proposition 15.2.2 of [IR]
and the observation that $1 + np^k \equiv 1 + n \pmod{p-1}$. Then Kummer’s
congruences [IR; Thm. 5, Ch. 15] imply

$$\frac{B_{1+np^k}}{1+np^k} \equiv \frac{B_{n+1}}{n+1} \pmod{p}. \qquad (**)$$

Dividing congruence (*) by $p$ and using congruence (**) yields the result.
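
The congruence of Theorem A1 can be tested numerically for small primes. The sketch below is an illustration under stated assumptions: it uses the standard formula $pB_{1,\omega^n} = \sum_{a=1}^{p-1} a\,\omega(a)^n$ and approximates $\omega(a)$ modulo $p^2$ by $a^p$, which is the $k = 1$ case of the limit above. It restricts to odd $n$ with $n + 1 \le p - 3$ so that all denominators are prime to $p$. The function names are ours.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(m: int) -> list:
    """Exact Bernoulli numbers B_0, ..., B_m with B_1 = -1/2."""
    B = [Fraction(0)] * (m + 1)
    B[0] = Fraction(1)
    for n in range(1, m + 1):
        B[n] = -sum(comb(n + 1, k) * B[k] for k in range(n)) / (n + 1)
    return B


def generalized_bernoulli_mod_p(p: int, n: int) -> int:
    """B_{1,omega^n} mod p, computed from p * B_{1,omega^n} = sum a * omega(a)^n mod p^2."""
    mod = p * p
    total = 0
    for a in range(1, p):
        omega_a = pow(a, p, mod)            # omega(a) modulo p^2
        total = (total + a * pow(omega_a, n, mod)) % mod
    if total % p != 0:
        raise ValueError("p - 1 divides n + 1, so B_{1,omega^n} is not p-integral")
    return (total // p) % p


def bernoulli_quotient_mod_p(p: int, n: int, B: list) -> int:
    """B_{n+1} / (n+1) reduced mod p (the denominator is assumed prime to p)."""
    x = B[n + 1] / (n + 1)
    return x.numerator * pow(x.denominator, -1, p) % p


def theorem_a1_holds(p: int, B: list) -> bool:
    """Compare both sides of Theorem A1 for odd n = 1, 3, ..., p - 4."""
    return all(
        generalized_bernoulli_mod_p(p, n) == bernoulli_quotient_mod_p(p, n, B)
        for n in range(1, p - 2, 2)
    )


if __name__ == "__main__":
    B = bernoulli_numbers(40)
    print([theorem_a1_holds(p, B) for p in (5, 7, 11, 13, 37)])
```

For p = 37 and n = 31 both sides vanish modulo 37, reflecting the fact that 37 divides the numerator of $B_{32}$.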


The second result we want to prove is Hilbert’s Theorem 94. We 
actually prove a slightly more general result which, given the cohomological 
tools we shall use, is no harder to prove (For further results using this type 
of analysis, see [CR]). 


Theorem A2. Let $L/K$ be a cyclic unramified extension of algebraic number
fields of odd degree $n$. Then the order of the kernel of the natural map
from $Cl_K$ to $Cl_L$ is divisible by $n$. In particular, the class number of $K$ is
divisible by $n$. (Here, $Cl_K$ and $Cl_L$ denote the class groups of $K$ and $L$
respectively.)


PROOF. Let $D_L$, $P_L$, and $U_L$ denote the divisors, principal divisors, and
units of $L$ respectively, with similar notation for the field $K$. Let $G =
\mathrm{Gal}(L/K)$. Consider the two exact sequences,

$$(0) \to U_L \to L^* \to P_L \to (0), \qquad (1)$$
$$(0) \to P_L \to D_L \to Cl_L \to (0). \qquad (2)$$

Using Hilbert’s Theorem 90 and equation (1), we find $H^1(G, U_L) \cong
P_L^G/\mathrm{im}(P_K)$. From equation (2) we derive

$$(0) \to P_L^G/\mathrm{im}(P_K) \to D_L^G/\mathrm{im}(P_K) \to Cl_L^G.$$

Since $L/K$ is unramified, $D_L^G = \mathrm{im}(D_K)$. Combining these remarks we
derive the following exact sequence,

$$(0) \to H^1(G, U_L) \to Cl_K \to Cl_L^G. \qquad (3)$$

For any $G$ module $M$, the Herbrand quotient of $M$ is defined to be

$$h(M) = \frac{\#H^2(G, M)}{\#H^1(G, M)}.$$

Given the hypotheses of the theorem it is well known that $h(U_L) = n^{-1}$.
For the proof, which depends on an analysis of the structure of $U_L \otimes \mathbb{Q}$ as
a $\mathbb{Q}[G]$-module, see [Lg], [A-T], or the article of J. Tate in [C-F]. It follows
immediately that $n$ divides the order of $H^1(G, U_L)$, which by equation (3)
is isomorphic to the kernel of the natural map from $Cl_K$ to $Cl_L$. This
completes the proof.


If n is even, it is still possible to derive results of a similar nature by 
making more restrictive hypotheses. For example, as Hilbert remarks, if 
n = 2 the result remains valid if we assume that every real prime of K
splits in L. 

ON TERNARY EQUATIONS OF FERMAT TYPE
AND RELATIONS WITH ELLIPTIC CURVES 


GERHARD FREY 


§1. CONJECTURES 


The main purpose of this chapter is to show how arithmetical prop- 
erties of elliptic curves E defined over global fields K and corresponding 
Galois representations are often related to interesting diophantine ques- 
tions, amongst which the most prominent is without doubt Fermat’s Last 
Theorem, which has now become Wiles’ theorem. 

Of course, the most important case for us is when K is a number field, or 
even equal to Q, but many of the conjectured (or proved) assertions make 
sense for fields of finite type and become more convincing in the case that 
K is a function field or, in other words, for families of elliptic curves. The 
reason for this is that the conjectures predict the behaviour of “generic” 
elliptic curves and it is expected that each “special” property gives rise to 
interesting diophantine properties of K. 

So assume from now on that $K$ is a global field, i.e., that $K$ is either a
finite number field or a function field of one variable over a perfect field $K_0$.
For simplicity we always assume that $\mathrm{char}(K_0) \ne 2, 3$. By $\Sigma_K$ we denote
the set of non-archimedean places of $K$. For divisors $D \in \mathbb{Z}[\Sigma_K]$ we define
$\deg(D)$ as the divisor degree in the usual sense if $K$ is a function field,
and as the logarithm of its norm in the number field case. For a subset
$S_0 \subset \Sigma_K$, let $O_{S_0}$ be the ring of $S_0$-integers; its units are denoted by $O_{S_0}^*$,
and called $S_0$-units.

The field $K$ has two important numerical invariants. By $g(K)$ we denote
the genus of $K$, which is the usual genus in the function field case, and equal
to $\frac{1}{2}\log|\Delta_{K/\mathbb{Q}}|$ in the number field case (where $\Delta_{K/\mathbb{Q}}$ is the discriminant
of $K/\mathbb{Q}$). By $d(K)$ we denote the degree of $K$, which is equal to $[K : \mathbb{Q}]$ if
$K$ is a number field. If $K$ is a function field, $d(K)$ is equal to

$$\min_{f \in K \setminus K_0} \deg((f)_\infty) = \min_{f \in K \setminus K_0} [K : K_0(f)],$$

where $(f)_\infty$ is the polar divisor of $f$; so $d(K)$ is the minimal degree of a
covering map from the curve corresponding to $K$ to $\mathbb{P}^1_{K_0}$.

(a) Conjectures about diophantine equations.

One of the most basic conjectures about solutions of ternary diophantine
equations comes a little bit disguised. For $x \in K^*$, let $h_K(x)$ denote its
height, and

$$\mathrm{supp}(x) = \prod_{\mathfrak{p} \in \Sigma_K,\ v_\mathfrak{p}(x) \ne 0} \mathfrak{p}$$

its divisor of support.


Conjecture 1. There are constants $c$ and $d$ such that the following holds.
Assume that $x \ne 0, 1$, and that $K/K_0(x)$ is separable if $K$ is a function
field. Then

$$h_K(x) \le c \cdot \deg \mathrm{supp}(x(x-1)) + d.$$


Of course here and in all of the conjectures which will follow, it is of 
great importance to specify how the constants depend on K. A refinement 
of Conjecture 1 is 


Conjecture 1’. For all $\epsilon \in \mathbb{R}_{>0}$, one can take $c = 1 + \epsilon$ and $d = d(\epsilon, g(K))$
in Conjecture 1. Further, the dependence of $d$ on $g(K)$ is linear, and for
$K$ a function field, $\epsilon = 0$ is allowed.


In section 2 we shall see that Conjecture 1’ is true for function fields. 
Now we shall translate Conjecture 1 into the A-B-C-Conjecture due to 
Masser-Oesterlé for K = Q. 

First assume that $K$ is a function field. We can assume that $K$ has a
prime $\mathfrak{p}_\infty$ of degree 1. Let $(x)_\infty$ be the polar divisor of $x$. The Riemann-Roch
theorem implies that there is an element $C \in K$ with

$$(C)_\infty \mid \mathfrak{p}_\infty^{\deg((x)_\infty) + 2g(K) - 1} \quad \text{and} \quad (x)_\infty \mid (C)_0,$$

where $(C)_0$ is the zero divisor of $C$. Take $A = x \cdot C$. Then

$$(A)_\infty = ((x)(C))_\infty \mid \mathfrak{p}_\infty^{\deg((x)_\infty) + 2g(K) - 1}$$

and

$$(A)_0 \mid (x)_0 \cdot (C)_0 \cdot (x)_\infty^{-1} = (x)_0 \cdot D'$$

with $\deg D' \le 2g(K) - 1$.

Let $S_0(x)$ be the set of primes $\mathfrak{p} \in \Sigma_K$ dividing $\mathfrak{p}_\infty \cdot D'$. Then $A$ and $C$ are
relatively prime elements in $O_{S_0(x)}$, and the same is true for $B := A - C$,
which is equal to $(x-1)C$, and $\deg(\gcd((A)_0, (B)_0)) \le 2g - 1$. If $K$ is
a number field of genus $g(K)$, we use the theorem of Minkowski to get a
corresponding result: There are elements $A, B, C \in O_K$ such that $x = A/C$
and $x - 1 = B/C$ with $\deg(\gcd((A), (B)))$ and $\deg(\gcd((A), (C)))$ bounded
linearly in $g(K)$. Hence Conjecture 1’ becomes:

A-B-C-Conjecture. Let $O_K$ be the ring of integers of $K$ (with respect to
a fixed place $\mathfrak{p}_\infty$ of $K$ in the function field case). Let $A$ and $B$ be elements
in $O_K$, let $D(A, B)$ be their greatest common divisor in $O_K$, and assume
that $K/K_0(A/B)$ is separable. Then

$$h_K(A) \le (1 + \epsilon)\deg\Big(\prod_{\mathfrak{p} \mid AB(A-B)} \mathfrak{p}\Big) + d(\epsilon, g(K), \deg(D(A, B))),$$

and the dependence on $\deg(D(A, B))$ and $g(K)$ should be linear.
For $K = \mathbb{Q}$, this means

$$|A| \le d(\epsilon)\Big(\prod_{p \mid AB(A-B)} p\Big)^{1+\epsilon}$$

whenever $\gcd(A, B) = 1$, with $d(\epsilon) = e^{d(\epsilon, 0, 0)}$ (this is the original form of
the conjecture of Masser-Oesterlé).
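
For $K = \mathbb{Q}$ the inequality compares $|A|$ with the product of the primes dividing $AB(A-B)$. The following Python sketch computes that product and the ratio $\log|A| / \log\prod p$ for a given coprime pair. The function names `radical` and `quality` are ours, and the sample triple $2 + 3^{10} \cdot 109 = 23^5$ is a well-known example that does not come from this chapter.

```python
from math import gcd, log


def radical(n: int) -> int:
    """Product of the distinct primes dividing n, by trial division (small inputs only)."""
    n = abs(n)
    rad, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            rad *= d
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        rad *= n
    return rad


def quality(A: int, B: int) -> float:
    """log|A| / log(product of primes dividing A*B*(A-B)), for coprime A and B."""
    if gcd(A, B) != 1:
        raise ValueError("A and B must be coprime")
    return log(abs(A)) / log(radical(A * B * (A - B)))


if __name__ == "__main__":
    A, B = 23 ** 5, 2          # A - B = 3**10 * 109
    print(radical(A * B * (A - B)), quality(A, B))
```

The conjecture asserts that for each fixed $\epsilon > 0$ this ratio exceeds $1 + \epsilon$ for only finitely many coprime pairs; a ratio well above 1, as in this example, is rare.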


Remark 1.1. Of course one would be glad to have any bounds c and d at 
all. But this sharp version has the advantage that an effective version of 
Faltings’ theorem about the finiteness of K-rational points on curves of 
genus $\ge 2$ would follow (see [E]).

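Now we fix a finite set $S_0$ of non-archimedean places of $K$ and let
$s_0 := \deg(\prod_{\mathfrak{p} \in S_0} \mathfrak{p})$. We assume that

$$a, b, c \in K^* \text{ satisfy } \mathrm{supp}(a)\,\mathrm{supp}(b)\,\mathrm{supp}(c) \mid \prod_{\mathfrak{p} \in S_0} \mathfrak{p}^3.$$

Let $n$ be an integer prime to $\mathrm{char}(K)$ and define

$$L_{(a,b,c),n}(K) = \Big\{ (x, y, z) \in K^3 \setminus (0, 0, 0) \;;\; ax^n - by^n = cz^n, \text{ and } K/K_0\big(\tfrac{a}{c}(\tfrac{x}{z})^n\big) \text{ is separable if } K \text{ is a function field} \Big\} \Big/ \sim$$

where $\sim$ means projective equivalence.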


Conjecture 2. (“Asymptotic Fermat Conjecture”) There are numbers
$N(g(K), s_0)$, such that for all $n > N(g(K), s_0)$ and all elements $(x, y, z) \in
L_{(a,b,c),n}(K)$, either $xyz = 0$ or $\{x/z, y/z\} \subset O_{S_0}^*$.


Remark 1.2. If $K$ is a number field or if $K_0$ is finite, it follows from
Conjecture 2 that

$$\bigcup_{\substack{n \ge 4 \\ \mathrm{supp}(a)\mathrm{supp}(b)\mathrm{supp}(c) \,\mid\, \prod_{\mathfrak{p} \in S_0} \mathfrak{p}^3}} L_{(a,b,c),n}(K)$$

is a finite set, and for $n$ large enough, $L_{(a,b,c),n}(K)$ consists of triplets with
coordinates in $\{0\} \cup \{\text{roots of unity}\}$.
An easy observation is 

Proposition 1.1. Conjecture 1 implies the Asymptotic Fermat conjecture.


Proof. We assume that $xyz \ne 0$. We take $t_1 = (a/c)(x/z)^n$ and $t_2 =
(b/c)(y/z)^n$ and assume that $t_1 \ne 0, 1$. So by Conjecture 1 we get:


n( ~ w()+ >d up (2)) < hac(t) + het) 


Up Sia Up is 
ee mete < 2(1 + €)s 9 deg supp(ryz) + d(g(K),€). 


Hence for $n$ large enough and $v_\mathfrak{p}(x/z) > 0$, it follows that $\mathfrak{p} \in S_0$, and the
same is true for $z/x$ and so $x/z \in O_{S_0}^*$, and the analogous equations for
$(b/c)(y/z)^n$ yield $(y/z)^n \in O_{S_0}^*$.


Of course it is most interesting to find $(a, b, c)$ such that one can determine
exactly the elements in $L_{(a,b,c),n}(K)$; in the last section we shall see
how Wiles’ theorem about modularity of semi-stable elliptic curves can be 
used to find such examples if K = Q, and especially a = b = c= 1 will 
lead to Fermat’s Last Theorem. 


(b) Conjectures about elliptic curves. 

Now we shall state conjectures about elliptic curves and corresponding 
Galois representations which are, as we shall see below, related to Conjec- 
tures 1 and 2. Since they seem to be interesting for their own sake, we shall 
begin by describing a more general context. 

So let $A/K$ be an abelian variety with conductor $N_A$, Faltings height
$h_K(A)$ and geometric height $h_{\mathrm{geom}}(A)$. For us it is important that for
elliptic curves $E/K$, the height $h_K(E)$ is closely related to $\deg(\Delta_E)$, where
$\Delta_E$ is the minimal discriminant divisor of $E$, and $h_{\mathrm{geom}}(E) \sim h_K(j_E)$ with
$j_E$ the absolute invariant of $E$ (see [Si]). It should be noted that, if $K$ is
a number field or $K_0$ is finite, then there are only finitely many abelian
varieties over $K$ with fixed dimension and bounded height.

Let $n$ be a natural number and let $A_n$ be the kernel of the multiplication
by $n \cdot \mathrm{id}_A$ in $A$. We shall always assume that $n$ is prime to $\mathrm{char}(K)$. The
Galois group $G_K = \mathrm{Aut}(K_{\mathrm{sep}}|K)$ acts on $A_n$, and so we get a representation


$$\rho_{A,n} : G_K \to \mathrm{GL}(A_n) = \mathrm{GL}(2 \cdot \dim A, \mathbb{Z}/n\mathbb{Z})$$


induced by this action. 

In general the image of $\rho_{A,n}$ is “as big as possible” subject to the
restrictions coming from the symplectic structure induced by the Weil pairing and
the decomposition of A into simple factors. 


Definition 1.1. Let $H$ be a finite subgroup of $A$ defined over $K$ and of
order prime to $\mathrm{char}(K)$. $H$ is called exceptional if
(i) there is no subgroup $H_1$ with $0 \ne H_1 < H$ such that $A$ is isomorphic
to $A/H_1$, and
(ii) there is no proper abelian subvariety $B$ of $A$ containing $H$.

Convention. We say that an abelian variety over $K$ has multiplicity 1 if
the (absolute) simple factors of $A$ are pairwise non-isogenous.


Question 1. Is there a number $N$ and a number $M$ such that the existence
of an exceptional subscheme $H$ of order $> N$ in an abelian variety of
multiplicity 1 implies $h_{\mathrm{geom}}(A) < M$?

Of course this question only makes sense if one specifies how N and M 
should depend on A and K. A possible guess could be that both depend 
only on the maximum of the dimensions of the simple factors of A and 
on the genus of K or, even stronger, on the irrationality degree of K. It 
may be too early to make more explicit guesses now, and we shall restrict 
ourselves to an observation depending on deep results of Faltings and of 
Masser-Wüstholz (see [B]]):


Let $K$ be a number field. Assume that we fix $d \in \mathbb{N}$ and a finite
set $S_0 \subset \Sigma_K$ and that we look at all abelian varieties $A/K$ of
dimension $d$ whose conductor $N_A$ has support inside of $S_0$. Then
$A$ lies in a finite set of isomorphism classes $\mathcal{A}$ of abelian varieties,
and if $A, A' \in \mathcal{A}$ are isogenous then there is a $K$-rational isogeny
$\eta : A \to A'$ with $\deg(\eta)$ bounded by a number depending on $K$
and $S_0$ only. Therefore it follows immediately that numbers $N$
and $M$ as in the question exist if we restrict ourselves to abelian
varieties with simple factors whose dimension and conductors
are bounded, and that these numbers depend on $S_0$ and, in
fact, on $g(K)$.


From now on we look at the special case that the factors of $A$ are elliptic
curves, and all essential features of the question can be found by looking
at elliptic curves $E$ or abelian varieties of dimension 2 which are isogenous
to $E_1 \times E_2$ with $E_i/K$ non-isogenous elliptic curves. We begin with $A = E$
an elliptic curve defined over $K$.


Definition 1.2. $E/K$ is admissible if $\mathrm{char}(K) = 0$ or $K/K_0(j_E)$ is
separable.


Let $H$ be an exceptional $K$-subgroup of $E$. It follows that $H$ is cyclic
and hence our question leads us to: 


Conjecture 3. There are numbers $M(g(K))$ and $N(g(K))$ such that if
$E/K$ is an admissible elliptic curve with a separable cyclic $K$-rational
isogeny of degree $N > N(g(K))$, then we have $h_{\mathrm{geom}}(E) < M(g(K))$.


Remark 1.3. (1) If K is a number field and if Conjecture 3 holds, then 
there exists a number N such that only elliptic curves with complex mul- 
tiplication defined over K have cyclic K-isogenies of degree > N. It seems 
to be not unreasonable to believe this statement. 

(2) In the next section we shall prove that Conjecture 3 is true if $K$ is a
function field.
(3) If we sharpen the rationality condition to “$H$ has a $K$-rational
generator,” then it is a theorem of Merel that in the number field case there is
a number N(d(K)) with N < M(d(K)). 
(4) A (much) weaker version of the stated conjectures is that M depends 
on N as well as on K, and then the corresponding statements are true over 
number fields, (see [Fr 2]). 

As a consequence of (4) one gets: 


Proposition 1.2. Let $d$ be a natural number and $p$ a prime. There exists
a number $l_p(d)$ such that for all $n_p > l_p(d)$, all number fields $K$ with
$[K : \mathbb{Q}] \le d$ and all elliptic curves $E/K$, if $E/K$ has a cyclic $K$-isogeny of
degree $p^{n_p}$, then $E$ has complex multiplication.


Proof. Corollary 2 in [Fr 2] states that for $p^{l_0} > 120d$, there are only
finitely many $j$-invariants $j_E$ in fields $K$ with $[K : \mathbb{Q}] \le d$ such that a
corresponding elliptic curve $E$ has a $K$-rational cyclic isogeny of degree
$p^{l_0}$. Let $E_i$ be an elliptic curve over $K_i$ corresponding to $j_i$. If $E_i$ has no
complex multiplication, then there is an $l_i$ such that neither $E_i$ nor any twist of
$E_i$ has a $K_i$-rational cyclic isogeny of degree $p^{l_i}$. Then $l_p(d) = \max(l_i)$
satisfies the assertion of the proposition.


Now we assume that dim A = 2. In this case the existence of exceptional 
subgroups is closely related to the geometric fundamental groups of curves 
of genus 2 over K. For a discussion of some aspects of this relation we refer 
to [Fr 3]. 

In the following we restrict our attention to the case that $A$ is not
$K$-simple, i.e., $A$ is $K$-isogenous to $E_1 \times E_2$ with $E_i/K$ being elliptic curves.
Let $H < A$ be exceptional. Inside of $A$ we have two non-isogenous elliptic
curves $E_1'$, $E_2'$ intersecting in $H' := E_1' \cap E_2'$ (which is not exceptional in
our sense) and $A/H' \cong E_1 \times E_2$.

The case that H (or H’) contains cyclic subgroups has been discussed 
above. So we shall assume that H’ = E ,,, for some n’ € N, and so E, & Ey. 

Let $\eta : A \to E_1 \times E_2$ be the quotient map and $\hat\eta$ its dual map whose
kernel $\hat H'$ is isomorphic (as $G_K$-module) to $H'$. Both $\hat H'$ and $\eta(H)$ are
exceptional, and since $h_K(A) = h_K(E_1 \times E_2)$, we lose nothing by assuming
that $A = E_1 \times E_2$ and that $A$ does not contain a proper cyclic subgroup.

It follows from these assumptions that $H$ is isomorphic to $p_i(H) = E_{i,N}$
(where $p_i$ is the projection to $E_i$) for $i = 1, 2$ and some $N \in \mathbb{N}$. So $H$ is
given as the graph of an element $\alpha \in \mathrm{Isom}_K(E_{1,N}, E_{2,N})$,

$$H = A_\alpha = \{(P, \alpha P);\ P \in E_{1,N}\},$$

and the representation $\rho_{E_1,N}$ of $G_K$ induced by the action on $E_{1,N}$ is
equivalent to $\rho_{E_2,N}$. Hence the question about the existence of exceptional
subgroups in the special case under consideration is a question about
equivalence of representations on torsion elements of elliptic curves raised by
Mazur in [Ma 2] for $K = \mathbb{Q}$.

We can give a geometric interpretation (see [K-S]). Let $X(N)$ be the
modular curve parametrizing level-$N$ structures of elliptic curves. Its group
of automorphisms is equal to $\mathrm{SL}(2, \mathbb{Z}/N)/\pm\mathrm{id}$. The triplet $(E_1, E_2, \alpha)$ gives
rise to a $K$-rational point on

$$\alpha_\epsilon(\mathrm{SL}(2, \mathbb{Z}/N)/\pm\mathrm{id}) \backslash X(N) \times X(N) =: Z_{N,\epsilon}.$$

Here $\alpha_\epsilon(\mathrm{SL}(2, \mathbb{Z}/N)/\pm\mathrm{id})$ is a diagonal embedding

$$\mathrm{SL}(2, \mathbb{Z}/N\mathbb{Z})/\pm\mathrm{id} \to (\mathrm{SL}(2, \mathbb{Z}/N\mathbb{Z})/\pm\mathrm{id})^2$$

induced by the conjugation with a matrix $g \in \mathrm{GL}(2, \mathbb{Z}/N\mathbb{Z})$ satisfying
$\det g = \epsilon \in (\mathbb{Z}/N\mathbb{Z})^*$. The number $\epsilon$ is determined by $e_{2,N}(\alpha P, \alpha Q) =
e_{1,N}(P, Q)^\epsilon$ for $P, Q \in E_{1,N}$ and $e_{i,N}$ the Weil pairing on points of order
$N$ of $E_i$.

One should hope that geometrical properties of the surface $Z_{N,\epsilon}$ will
yield results for the questions raised above which can now be expressed
as follows: Let $Z'_{N,\epsilon}(K) \subset Z_{N,\epsilon}(K)$ be the set of $K$-rational points $P$
corresponding to $(E_1, E_2, \alpha)$ with $E_i$ admissible and $E_1$ not isogenous to
$E_2$.


Conjecture 4. There are numbers $N(g(K))$, $M(g(K))$ such that for all
$N > N(g(K))$ and $P \in Z'_{N,\varepsilon}(K)$ with corresponding curves $E_1, E_2$, one
has

$$\max_i\,(h_{\mathrm{geom}}(E_i)) \le M(g(K)).$$


It is not hard to see, but worthwhile to remark, that for $K$ a number
field it follows from Conjecture 4 that for $N$ large enough, $Z'_{N,\varepsilon}(K) = \emptyset$.
In the conjecture both $N$ and $M$ depend on $K$. Another possible guess is


Conjecture 4’. There is an $N_0$, independent of $K$, such that for $N >
N_0$ there is a number $M(g(K),N)$ such that for all $P \in Z'_{N,\varepsilon}(K)$ with
corresponding curves $E_1, E_2$, one has $\max_i\,(h_{\mathrm{geom}}(E_i)) \le M(g(K),N)$.


Remark 1.4. In the number field case, Conjectures 4 and 4’ are essentially 
due to Darmon ([Da]). 

Kani and Schanz proved in [K-S] that for $N > 13$, the surface $Z_{N,\varepsilon}$ is
of general type. Hence if $K$ is a number field, one would expect according
to Lang’s conjecture that the $K$-rational points of $Z_{N,\varepsilon}$ are concentrated
on curves of genus $\le 1$. This and the explicit knowledge of intermediate
curves between $X(N)$ and $X(1)$ motivates


Conjecture 4”. (Kani)¹ If $N$ is a prime $> 23$ and $C \subset Z_{N,\varepsilon}$ is a curve
of genus $\le 1$, then it is a twisted Hecke correspondence. In particular,
$C \cap Z'_{N,\varepsilon} = \emptyset$ (see [K]), and hence Conjecture 4’ follows from Lang’s conjecture,
if $K$ is a number field and $N$ is a prime $> 23$.


¹We use this opportunity to thank E. Kani for many very valuable discussions and
hints.


We shall see in the next section that in the “generic” case (e.g., $K$ a
function field over an algebraically closed field of characteristic 0) one can
guess how curves of higher genus can be embedded into $Z_{N,\varepsilon}$ to come nearer
to Conjecture 4’. The situation becomes much simpler if we fix one of the
curves $E_i$, which we denote by $E_0$ from now on.


Conjecture 5. Let $E_0/K$ be admissible. There are numbers $N_0(g(K), E_0)$
and $M(g(K), E_0)$ such that for $N > N_0(g(K), E_0)$ and $(E_0, E)$ corresponding
to a point $P \in Z'_{N,\varepsilon}(K)$, we have $h_K(E) \le M(g(K), E_0)$.


Again Conjecture 5 is true over function fields. A crucial point is the
fact that $\rho_{E_0,N} \cong \rho_{E,N}$ imposes strong arithmetical conditions on $E$.


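Proposition 1.3. Assume that $H < E_1 \times E_2$ is exceptional over $K$ of
order $N^2$ with $N > 5$. Let $N_{E_i}$ be the conductor of $E_i$.

(1) If $\mathfrak{p} \in \Sigma_K$ does not divide $N \cdot N_{E_1}$, then $E_2$ is semi-stable at $\mathfrak{p}$.

(2) If $\mathfrak{p}$ does not divide $N_{E_1}$, then $\max(-v_{\mathfrak{p}}(j_{E_2}), 0) \equiv 0 \bmod N$.


Proof. If $\mathfrak{p} \nmid N \cdot N_{E_1}$, then $K(E_{1,N}) = K(E_{2,N})$ is unramified at $\mathfrak{p}$, and so
$E_2$ has to be semi-stable at $\mathfrak{p}$ (see [Si]). If $\mathfrak{p} \nmid N_{E_1}$, then $E_{2,N}$ is a finite
group scheme at $\mathfrak{p}$, and using the theory of Tate curves (see [Si] again) it
follows that if $v_{\mathfrak{p}}(j_{E_2}) < 0$, then $v_{\mathfrak{p}}(j_{E_2}) \equiv 0 \bmod N$.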


As a consequence we see that prime divisors of $N_{E_2}$ which don’t divide
$N_{E_1}$ give a large contribution to the height of $E_2$, and this should contradict
the conjectured relation between the height of elliptic curves and the degree
of their conductors which is stated as:


Conjecture 6. (Height Conjecture for Elliptic Curves) There are constants
$c$ and $d(g(K))$ such that for all admissible elliptic curves over $K$ we
have

$$h_K(E) \le c \deg N_E + d(g(K)).$$

Moreover, for any $\varepsilon > 0$, the numbers $c = \frac{1}{2} + \varepsilon$ and $d = d(g(K), \varepsilon)$ should
work, with $\varepsilon = 0$ allowed if $K$ is a function field.


Remark 1.5. If we replace $h_K(E)$ by $\frac{1}{12} \deg \Delta_E$ (i.e., if we take care only
of the contribution of the non-archimedean places to the height of $E$), we
get:


Szpiro’s Conjecture. There exist constants $c', d'$ such that

$$\deg \Delta_E \le c' \deg N_E + d'.$$


This conjecture would suffice for many applications. 
Here are some consequences of Conjecture 6. 
Proposition 1.4. Assume that Conjecture 6 is true over $K$ with constants
$c$ and $d$. Let $\delta_K$ be equal to 1 if $K$ is a function field, and equal to $[K : \mathbb{Q}] + 6$
if $K$ is a number field.

(i) Let $S_0$ be a finite set of places of $K$, $s_0 = \deg \prod_{\mathfrak{p} \in S_0} \mathfrak{p}$. There is
a number $N_0(s_0, c, d, d(K))$ such that all admissible elliptic curves $E/K$
which are semi-stable outside of $S_0$ and have a $K$-rational cyclic isogeny of
degree $N > N_0$ have potential good reduction at all $\mathfrak{p} \in \Sigma_K$.

(ii) Let $P$ be a $K$-rational point of an admissible $E$ with order $N$ prime
to $\mathrm{char}(K)$. Then either $N \le \max(5\delta_K, 2c + 2d)$, or $E$ has good reduction
at all non-archimedean places of $K$, and so $j_E \in K_0$ if $K$ is a function field
and $h_K(E)$ is bounded (depending on $d$ and $d(K)$) if $K$ is a number field.

(iii) Fix an admissible $E_0/K$. There is a number $N_0(c, d, N_{E_0}, d(K)) =:
N_0$ such that for all $N > N_0$ and all admissible elliptic curves $E/K$ with
$(E_0, E)$ corresponding to a point $P \in Z'_{N,\varepsilon}(K)$, we get

$$h_{\mathrm{geom}}(E)_{\mathrm{an}} \le M(c, d, N_{E_0}, d(K)),$$

where $h_{\mathrm{geom}}(E)_{\mathrm{an}}$ is the non-archimedean part of the geometric height of $E$.
If $E$ is semi-stable at divisors of $N$, we can replace $h_{\mathrm{geom}}(E)_{\mathrm{an}}$ by $h_K(E)$.
If $c = \frac{1}{2} + \varepsilon$, and if $d$ depends only on $g(K)$ and $\varepsilon$, we get Conjecture
5 for pairs $(E_0, E)$ corresponding to $P \in Z'_{N,\varepsilon}(K)$ with $E$ admissible and
semi-stable at divisors of $N$.


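Proof. (i) Let $N_E = N'_E \prod_{\mathfrak{p} \in S_0} \mathfrak{p}^{n_{\mathfrak{p}}}$ be the conductor of $E$ and hence of
$\eta(E)$, where $N'_E$ is prime to $S_0$ and the $n_{\mathfrak{p}}$’s are bounded by $d(K)$. Using
the theory of Tate curves, one gets (see [Fr 1]): If $\mathfrak{p} \mid N'_E$, then
$-v_{\mathfrak{p}}(j_E) - v_{\mathfrak{p}}(j_{\eta(E)}) \ge N + 1$, and so

$$2c \deg N_E + 2d \ge 2h_K(E) \ge \frac{1}{12}(N + 1) \deg N'_E + d(d(K))$$

for some $d(d(K)) \in \mathbb{R}$. It follows that for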


N> 12 (deg [[ p?+4t+o- d(d(K))), 
peSo 


we have $\deg(N'_E) = 0$.

(ii) If $E$ has a $K$-rational point of order $N > 5$, then $E$ has semi-stable
reduction at all places $\mathfrak{p} \nmid N$. So if $N = p^k N'$ and $N' > 5$, then $E$ is
semi-stable at all divisors of $p$. If this condition is not satisfied but $p^k > v_{\mathfrak{p}}(p) + 6$,
then again $E$ is semi-stable at $\mathfrak{p} \mid p$. Hence $E/K$ is semi-stable at all $\mathfrak{p} \in \Sigma_K$
if

$$N > 5\delta_K = \begin{cases} 5 & \text{if } K \text{ is a function field}, \\ 5([K : \mathbb{Q}] + 6) & \text{if } K \text{ is a number field}. \end{cases}$$

Hence by using (i) (with $\eta$ corresponding to $\langle P \rangle$), we get: If

$$N > \max(5\delta_K, 24c + 24d),$$
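then $E$ has good reduction at all places of $K$, and the assertion follows.

(iii) Assume that $(E_0, E)$ corresponds to $P \in Z'_{N,\varepsilon}(K)$. If $N > 5$ it follows
from Proposition 1.3 that

$$N_E = N'_E \prod_{\mathfrak{p} \mid N,\ v_{\mathfrak{p}}(j_E) \ge 0} \mathfrak{p}^{\delta_{\mathfrak{p}}} \prod_{\mathfrak{p} \mid N_{E_0}} \mathfrak{p}^{\delta_{\mathfrak{p}}}$$

with $N'_E$ prime to $N_{E_0}$ and $0 \le \delta_{\mathfrak{p}}$ bounded by numbers depending on
$d(K)$. Moreover $v_{\mathfrak{p}}(j_E) \equiv 0 \bmod N$ if $\mathfrak{p} \mid N'_E$. Hence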

$$\frac{N}{12} \deg N'_E \le c\Big(\deg N'_E + \deg \prod_{\mathfrak{p} \mid N} \mathfrak{p}^{\delta_{\mathfrak{p}}}\Big) + d'(N_{E_0}, c, d).$$


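Since $\deg \prod_{\mathfrak{p} \mid N} \mathfrak{p}^{\delta_{\mathfrak{p}}}$ is bounded by a number depending on $d(K)$, it follows
that for large $N$ (depending on $N_{E_0}$, $d(K)$, $c$, $d$), $N'_E$ is trivial, and so the
non-archimedean part of $h_{\mathrm{geom}}(E)$ is bounded. If $E$ is semi-stable at all
divisors of $N$, then $N_E = N'_E \prod_{\mathfrak{p} \mid N_{E_0}} \mathfrak{p}^{\delta_{\mathfrak{p}}}$, and hence for large $N$ we have
$N_E = \prod_{\mathfrak{p} \mid N_{E_0}} \mathfrak{p}^{\delta_{\mathfrak{p}}}$. So in this case $h_K(E)$ is bounded.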


Corollary 1.1. Assume that Conjecture 6 is true over $K$ and that $K$ is
a number field or a function field over a finite field $K_0$. Then for $N$ large
enough we get: If $E/K$ is semi-stable at all divisors of $N$ and $\rho_{E,N} \cong
\rho_{E_0,N}$, then $E$ is isogenous to $E_0$. Especially, Conjecture 6 for function
fields implies Conjecture 5 for function fields.


(c) Relations between elliptic curves and Fermat equations. 

The considerations in the last section have shown that the arithmetical
properties of the representations $\rho_{E,N}$ are closely connected with arithmetical
properties of $\Delta_E$. If $E$ has $K$-rational points of order 2, these properties
can be translated into properties of the $X$-coordinates of such points as was
done in various papers by Hellegouarch [He] and the author. The procedure
is described in detail in [Fr 1], but since it is a crucial point we shall repeat
it for the convenience of the reader.

Take $x \in K \setminus \{0,1\}$, choose a prime $\mathfrak{p}_\infty \in \Sigma_K$ if $K$ is a function field,
and let $O_K$ be the ring of integers of $K$ (with respect to $\mathfrak{p}_\infty$ in the function
field case). Now we define

$$E_{(x)} : Y^2 = X(X - 1)(X - x).$$

Since

$$j_{E_{(x)}} =: j_x = 2^8\, \frac{(x^2 - x + 1)^3}{x^2 (x - 1)^2},$$

it follows that $E_{(x)}$ is admissible if $K/K_0(x)$ is separable, and that

$$h_K(j_{E_{(x)}}) \ge c + 6 h_K(x).$$
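As a small sanity check of the $j$-invariant formula above, the following Python sketch evaluates $j_x$ for rational $x$ with exact arithmetic. The function names `legendre_j` and `lambda_orbit` are illustrative choices, not notation from the text; the second helper lists the six values of $x$ that are classically known to give the same $j$-invariant, which the sketch uses only as a consistency test.

```python
from fractions import Fraction


def legendre_j(x):
    """j-invariant of Y^2 = X(X-1)(X-x) for rational x not in {0, 1}."""
    x = Fraction(x)
    if x in (0, 1):
        raise ValueError("x must differ from 0 and 1")
    return 2**8 * (x * x - x + 1) ** 3 / (x * x * (x - 1) ** 2)


def lambda_orbit(x):
    """The six parameters obtained from x by permuting the points 0, 1, infinity."""
    x = Fraction(x)
    return [x, 1 / x, 1 - x, 1 / (1 - x), x / (x - 1), (x - 1) / x]


if __name__ == "__main__":
    print(legendre_j(2))            # 1728
    print(legendre_j(3))            # 21952/9
    print({legendre_j(v) for v in lambda_orbit(3)})
```

For $x = 2$ the numerator is $2^8 \cdot 3^3$ and the denominator is $4$, so the value is $1728$.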
$E_{(x)}$ has good reduction at all primes $\mathfrak{p}$ not dividing $2 \cdot \mathrm{supp}(x(x-1))$. For
$\mathfrak{p} \mid \mathrm{supp}(x(x - 1))$ but $\mathfrak{p} \nmid 2$, we have

$$-v_{\mathfrak{p}}(j_x) = \begin{cases} 2 v_{\mathfrak{p}}(x(x-1)) & \text{if } v_{\mathfrak{p}}(x) \ge 0, \\ -2 v_{\mathfrak{p}}(x) & \text{if } v_{\mathfrak{p}}(x) < 0, \end{cases}$$

and hence we have bad reduction at all divisors of $\mathrm{supp}(x(x - 1))$ which
are prime to 2.
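Now choose $A, C \in O_K$ such that $x = A/C$ and such that $\deg(\gcd(A,C))$ is
bounded by a number depending on $g(K)$ (see the discussion of the
A-B-C-Conjecture), and take $B = A - C$.
Then

$$E_{(A,B)} : Y^2 = X(X - A)(X - B)$$

is a twist of $E_{(x)}$. Define

$$S_0 = \{\mathfrak{p} \in \Sigma_K;\ \mathfrak{p} \mid 2\mathfrak{p}_\infty \cdot \gcd(A, C)\}.$$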
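Let $\Delta_{(A,B)}$ be the discriminant of $E_{(A,B)}$. Then for $\mathfrak{p} \notin S_0$ we have:

$$v_{\mathfrak{p}}(\Delta_{(A,B)}) = -v_{\mathfrak{p}}(j_x) = -v_{\mathfrak{p}}\Big(2^8\, \frac{(A^2 + B^2 - AB)^3}{A^2 B^2 C^2}\Big).$$

Moreover we get for $\mathfrak{p} \notin S_0$ that $X(X - A)(X - B) \bmod \mathfrak{p}$ has at least two
different zeros, and hence $E_{(A,B)}$ has semi-stable reduction modulo $\mathfrak{p}$ for
all $\mathfrak{p} \notin S_0$, and has good reduction if $\mathfrak{p} \notin S_0 \cup \mathrm{supp}(ABC)$. Hence

$$N_{(A,B)} = N_{E_{(A,B)}} = \prod_{\mathfrak{p} \in S_0} \mathfrak{p}^{n_{\mathfrak{p}}} \prod_{\mathfrak{p} \mid (ABC),\ \mathfrak{p} \notin S_0} \mathfrak{p},$$

and for $\mathfrak{p} \notin S_0$ with $\mathfrak{p} \mid \mathrm{supp}(x(x-1))$ we have $v_{\mathfrak{p}}(\Delta_{(A,B)}) = 2 v_{\mathfrak{p}}(ABC)$.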
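To make the reduction statements concrete in the simplest case $K = \mathbb{Q}$, here is a hedged Python sketch. It uses the standard fact that the model $Y^2 = X(X-A)(X-B)$ has discriminant $16\,A^2B^2(A-B)^2$ (this model need not be minimal at 2), and it counts the distinct roots of the cubic modulo an odd prime. The helper names and the sample triple $16 + 9 = 25$ are illustrative assumptions, not taken from the text.

```python
def frey_discriminant(A, B):
    """Discriminant of the model Y^2 = X(X - A)(X - B), with C = A - B."""
    C = A - B
    return 16 * (A * B * C) ** 2


def valuation(n, p):
    """Exponent of the prime p in the nonzero integer n."""
    n = abs(n)
    v = 0
    while n % p == 0:
        n //= p
        v += 1
    return v


def distinct_roots_mod(A, B, p):
    """Number of distinct roots of X(X - A)(X - B) modulo the prime p."""
    return len({0, A % p, B % p})


if __name__ == "__main__":
    A, B = 16, -9            # C = A - B = 25, i.e. 16 + 9 = 25
    C = A - B
    for p in (3, 5, 7):
        print(p,
              valuation(frey_discriminant(A, B), p),   # v_p of the discriminant
              2 * valuation(A * B * C, p),             # 2 * v_p(ABC)
              distinct_roots_mod(A, B, p))             # 2: nodal, 3: good
```

For odd $p$ and coprime $A, B$ the first two printed columns agree, and the root count is 3 exactly when $p \nmid ABC$.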
Now we are ready to state 


Proposition 1.5. Assume that Conjecture 6 holds with constants $c$ and
$d$. Then for all $\varepsilon \in \mathbb{R}_{>0}$,

$$h_K(x) \le 2(c + \varepsilon) \deg \mathrm{supp}(x(x - 1)) + d(d, \varepsilon, d(K)).$$

If $K$ is a function field, then $\varepsilon = 0$ is allowed.
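For $K = \mathbb{Q}$ the two sides of this inequality can be computed directly. The sketch below assumes the usual normalization, which is not spelled out in this passage: for $x = A/C$ in lowest terms, $h(x) = \log\max(|A|,|C|)$, and $\deg \mathrm{supp}(x(x-1))$ is the logarithm of the product of the primes dividing $ABC$ with $B = A - C$. The names `radical` and `height_ratio` are hypothetical.

```python
from math import gcd, log


def radical(n):
    """Product of the distinct primes dividing the nonzero integer n."""
    n = abs(n)
    rad, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            rad *= p
            while n % p == 0:
                n //= p
        p += 1
    return rad * n if n > 1 else rad


def height_ratio(A, C):
    """log max(|A|,|C|) / log rad(ABC) for x = A/C in lowest terms, B = A - C."""
    B = A - C
    if A == 0 or C == 0 or B == 0 or gcd(A, C) != 1:
        raise ValueError("need coprime nonzero A, C with A != C")
    return log(max(abs(A), abs(C))) / log(radical(A * B * C))


if __name__ == "__main__":
    print(height_ratio(9, 8))   # 9 - 8 = 1: ratio log 9 / log 6, slightly above 1.2
    print(height_ratio(5, 3))   # a typical triple: ratio below 1
```

A bound of the shape in Proposition 1.5 with $c = \frac{1}{2} + \varepsilon$ would keep this ratio below $1 + 2\varepsilon$ for all but finitely many coprime pairs.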


So the height conjecture for elliptic curves implies the A-B-C-Conjecture
and the Asymptotic Fermat Conjecture. We continue to look at the curve
$E_{(A,B)}$ and get with $C = A - B$:


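Proposition 1.6. For $N \in \mathbb{N}$, define

$$S_N := S_0 \cup \{\mathfrak{p} \in \mathrm{supp}(ABC);\ 2 v_{\mathfrak{p}}(ABC) \not\equiv 0 \bmod N\}.$$

Then $E_{(A,B),N}$ is a finite group scheme at all $\mathfrak{p} \notin S_N$, and the Artin
conductor $N_{(A,B),N}$ of $\rho_{E_{(A,B)},N}$ is equal to

$$N_{(A,B),N} = \prod_{\mathfrak{p} \in S_N \cup \mathrm{supp}(N)} \mathfrak{p}^{n_{\mathfrak{p}}}.$$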
Corollary 1.2. Assume that $(x,y,z) \in L_{(a,b,c),N}(K)$, and suppose that
$S_0 \subset \Sigma_K$ satisfies $x, y, z \in O_{S_0}$, $\{a,b,c\} \subset O^*_{S_0}$, and the prime divisors of
$\mathrm{supp}(\gcd(x, y, z))$ are in $S_0$. Then

$$N_{\rho_{E_{(ax^N, by^N)},N}} = \prod_{\mathfrak{p} \in S_0 \cup \mathrm{supp}(N)} \mathfrak{p}^{n_{\mathfrak{p}}},$$

and at $\mathfrak{p} \mid N$, $\mathfrak{p} \notin S_0$, the subgroup scheme $E_{(ax^N, by^N),N}$ is finite.


The disturbing fact is that in our general setting the set $S_0$ can depend on
$(x,y,z)$. But since the height conjecture, and hence the asymptotic Fermat
conjecture, are true in the function field case, the interesting application of
Corollary 1.2 is to the case that $K$ is a number field. In this case we find a
fixed set $S_K \subset \Sigma_K$ such that $O_{S_K}$ is a principal ideal domain and such that
$\mathrm{supp}(2) \subset S_K$ (where the degree of $\prod_{\mathfrak{p} \in S_K} \mathfrak{p}$ can be bounded by a number
depending on $g(K)$), and hence we can represent elements in $L_{(a,b,c),N}(K)$
up to projective equivalence by $(x, y, z) \subset O_{S_K}$ which are relatively prime.

Hence we get 


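Corollary 1.3. Assume that $K$ is a number field and $S_K$ is as above.
Then for each $(x', y', z') \in L_{(a,b,c),N}(K)$ with $x'y'z' \ne 0$, we can find a
projectively equivalent triple $(x,y,z)$ such that

$$N_{\rho_{E_{(ax^N, by^N)},N}} = \prod_{\mathfrak{p} \in S_K \cup \mathrm{supp}(abcN)} \mathfrak{p}^{n_{\mathfrak{p}}},$$

and $E_{(ax^N, by^N),N}$ is a finite group scheme at $\mathfrak{p} \mid N$, but $\mathfrak{p} \nmid abc$.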


Conversely assume that we have an elliptic curve $E/K$, a finite set $S_0$
of primes containing all divisors of 2, and $N > 5$ such that $O_{S_0}$ is a principal
ideal domain and such that $E_N$ is finite at $\mathfrak{p} \notin S_0$. It follows that
$g(K(E_2))$ is controlled by $\deg \prod_{\mathfrak{p} \in S_0} \mathfrak{p}$, and hence in our context we can
assume without great loss of generality that $E_2 \subset E(K)$. Then we find an
equation $Y^2 = X(X - A)(X - B)$ for $E$ with $A, B \in O_{S_0}$ relatively prime.
(Here we use that $E$ has to have semi-stable reduction outside of $S_0$.) So
for $\mathfrak{p} \notin S_0$, we get

$$v_{\mathfrak{p}}(A) \equiv v_{\mathfrak{p}}(B) \equiv v_{\mathfrak{p}}(A - B) \equiv 0 \bmod N,$$

and hence

$$A = a x^N, \quad B = b y^N, \quad \text{and} \quad C = c z^N$$

with $a, b, c \in O^*_{S_0}$ and $(x,y,z)$ relatively prime in $O_{S_0}$. So $(x,y,z) \in
L_{(a,b,c),N}(K)$, and we can summarize:
Proposition 1.7. The Asymptotic Fermat Conjecture (Conjecture 2) implies
Conjecture 5 for even $N$. That is, for given $E_0$ there is a number
$N_0 = N_0(g(K), E_0)$ such that for $N > N_0$ and $\rho_{E_0,N} \cong \rho_{E,N}$, it follows
that $E_0$ is isogenous to $E$. Conversely, the Asymptotic Fermat Conjecture
for a given set $S_0$ holds if there are only finitely many odd irreducible
two-dimensional representations of $G_K$ into $\mathrm{GL}(2, \mathbb{Z}/N\mathbb{Z})$ (with some $N \in \mathbb{N}$,
$N$ large enough) which are unramified outside of $S_0$ and finite at divisors of
$N$. Especially there are no solutions $(x,y,z)$ in $L_{(a,b,c),N}(K)$ with $xyz \ne 0$
and $x, y, z \in O_K$ relatively prime if there are no odd representations

$$\rho : G_K \to \mathrm{GL}(2, \mathbb{Z}/N\mathbb{Z})$$

satisfying the conditions above with $S_0 = \{\mathfrak{p};\ \mathfrak{p} \mid 2abc\}$.


It is obvious that the verification of such non-existence theorems as
discussed above for $G_K$-representations $\rho$ needs strong number theoretical
tools. Since in general the image of $\rho$ is not solvable, “classical” methods of
class field theory cannot be applied. Looking for other methods, it is quite
natural that one gets the idea to use “Langlands’ philosophy,” or over $\mathbb{Q}$,
its beautiful concretization given by Serre’s conjecture, and hence one is
naturally led to the theory of modular forms.


(d) Dependence on d(K). 

Above we have stated conjectures involving constants which should depend
on $g(K)$. This seems to be quite natural for the A-B-C-Conjecture
and the height conjecture of elliptic curves, but for other conjectures, for
instance about the existence of cyclic isogenies, one could hope that the
constants depend only on $d(K)$. (Cf. Merel’s theorem about the order of
torsion points of elliptic curves.) Especially this seems to be convincing if
the objects one looks for are parametrized by a family of curves $D_n$ and
if one can prove that the irrationality degree of $D_n$ grows with $n$. For
simplicity we restrict the discussion to the case that $K$ is a number field.

We have used already a bound for the irrationality degree of $X_0(N)$
over $\mathbb{Q}$ to get finiteness results for isogenies. Now we shall use a result for
this degree over $\mathbb{C}$ of Abramovich² to get corresponding results for elliptic
curves with isomorphic torsion structures:

Fix $E_0/K$ and $N$. Consider the set of elliptic curves $E$ over extension
fields $L$ of $K$ such that there is a map $\alpha : E_{0,N} \to E_N$ so that the triple
$(E_0, E, \alpha)$ corresponds to a point $P \in Z_{N,\varepsilon}(L)$. This set is parametrized
by a twisted modular curve $X_{E_0,N,\varepsilon}$, defined over $K$, which is isomorphic
to $X(N)$ over $\mathbb{C}$. Hence $X_{E_0,N,\varepsilon}$ has the same irrationality degree over $\mathbb{C}$
as $X(N)$, which is at least equal to the irrationality degree of $X_0(N)$ over
$\mathbb{C}$, and in [A] it is proved that $d(X_0(N) \times \mathbb{C}) > N/256$. Hence we get:


²I would like to thank B. Edixhoven for this reference.
Proposition 1.8. Let $K$ be a number field and $E_0/K$ an elliptic curve.
For every $d \in \mathbb{N}$ there exists a number $N_0 = N_0(d, E_0)$ such that for $N >
N_0$, the set

$$\bigcup_{[L:K] \le d} \{P \in Z_{N,\varepsilon}(L);\ P \text{ corresponds to a triple } (E_0, E, \alpha) \text{ with } E \not\sim E_0\}$$

is finite. Further, for all primes $p$, there is a number $l_p = l(p, d, E_0)$ such
that for $N = p^{l_p}$, the corresponding set is empty.


In the case of Fermat curves the situation is similar: Let $C_{a,b,c,N}$ be the
curve given by

$$aX^N - bY^N = cZ^N.$$

This is a plane projective curve without singularities over $K$, and a theorem
going back to M. Noether³ and proved (in a generalized version) by
Hartshorne in [H] implies that $d(C_{a,b,c,N}) = N$. Hence we get:


Proposition 1.9. Let $K$ be a number field, let $a, b, c \in K^*$, let $d \in \mathbb{N}$, and
let $N > 2d$. Then $C_{a,b,c,N}$ has only finitely many points of degree $\le d$ over
$K$, and hence for any finite set $S_0$ of primes of $K$, the set

$$\bigcup_{\substack{[L:K] < N/2 \\ \mathrm{supp}(a)\,\mathrm{supp}(b)\,\mathrm{supp}(c) \,\mid\, \prod_{\mathfrak{p} \in S_0} \mathfrak{p}}} L_{(a,b,c),N}(L)$$

is finite. Further, for any extension field $L$ of $K$ and a fixed prime $p$, there
is a number $l_p$ such that every solution of $L_{(a,b,c),p^{l_p}}(L)$ has, up to projective
equivalence, coordinates in $\{0\} \cup \mu'(L)$, where $\mu'(L)$ are the roots of unity
in $L$ with order prime to $p$.
§2. THE GENERIC CASE 


In this section we shall assume that $K$ is a function field in one variable
over a perfect field $K_0$.


Proposition 2.1. Let $E/K$ be an admissible elliptic curve. Then

$$h_K(E) \le \frac{1}{2} \deg N_E + (g(K) - 1),$$

hence Conjecture 6 is true.


Proof. The height conjecture was proved in the case that $\mathrm{char}(K_0) = 0$ by
Parshin and in general by Szpiro, using the geometry of the elliptic surface
$\mathcal{E}$ defined by $E$ over $K_0$ and the Bogomolov-Miyaoka-Yau inequality
between Chern classes of $\mathcal{E}$, and it is an idea and hope of Parshin [P] that this
kind of proof can possibly be “translated” into the frame of arithmetical
surfaces over number fields. Here we shall give a very elementary proof
which uses only Hurwitz’ genus formula for function fields. It can be found
in [Fr 1] already, and we repeat it for the convenience of the reader.


³I would like to thank H. Stichtenoth and W.-D. Geyer for this reference.

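First we can assume that $j_E \notin K_0$ and that $K_0$ is algebraically closed.
The next observation is that it is enough to prove the inequality for a
quadratic twist of $E$, and so we can assume that $E$ is semi-stable at all
$\mathfrak{p} \in \Sigma_K$ with $v_{\mathfrak{p}}(j_E) < 0$. We have to prove:

$$\deg \Delta_E \le 6 \deg N_E + 12(g - 1).$$

Our assumptions imply:
If $\mathfrak{p}^2 \mid N_E$ then $v_{\mathfrak{p}}(j_E) \ge 0$ and $0 < v_{\mathfrak{p}}(\Delta_E) \le 10$. If $j_E \equiv 12^3 \bmod \mathfrak{p}$ then
$v_{\mathfrak{p}}(\Delta_E) \le 9$, and if $j_E \equiv 0 \bmod \mathfrak{p}$ then $v_{\mathfrak{p}}(\Delta_E) \le 8$. If $v_{\mathfrak{p}}(j_E) < 0$ then
$v_{\mathfrak{p}}(\Delta_E) = -v_{\mathfrak{p}}(j_E)$ and

$$[K : K_0(j_E)] = \deg \prod_{v_{\mathfrak{p}}(j_E) < 0} \mathfrak{p}^{-v_{\mathfrak{p}}(j_E)} =: d_\infty.$$

Let $N'_E$ be the square free part of $N_E$. Now use Hurwitz genus formula to get: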


29(K) —2 > —2d.. + do — deg Np 
+deg( |] p)+2deg( [J p)+deg( [] Pp). 


PINe PINE p2|Np 
vp (3— —-123)>0 Up(Jg)>O vp (Ag )=10 
Define 
0) _ 1) _ 2) _ 
s =deg( [J] p), s@=deg( [[ p), s@=deg( [] pp). 
PINE PINE PINE 
vp (ig —123)>0 tp (jz)>0 vp(Ag)=10 
Then: 


do.—-s  d—s® 


u 
2g —2 > —d — deg N; = = g(?) 
g eg oie 5 + 3 + 35 , 


if 39 9 1 
qdoo < deg Ng + ~5- + ao ~ ae 
and so 


deg Ap ~12(g—1) <6degNp+10deg( |] 9) 
wpe eee 


+ 12deg( Il p) + 12 deg( I] p) 


PINE PINE 
vp (Fg —123)>0 vp (7 = )>0) 


< 6 deg Ne. 
Corollary 2.1. There are constants $d_0$ and $d_1$ such that for all function
fields $K$ over perfect fields $K_0$ and for all $x \in K \setminus \{0,1\}$ with $K/K_0(x)$
separable we get:


1 
hx(z) < 3 deg supp(z(z — 1)) + dig(K) + do, 


hence Conjecture 1’ is true, as well as Conjecture 2 and Conjecture 5. 


We can use the height formula to bound the order of $K$-rational torsion
subgroups for elliptic curves defined over $K$, but we can do much better:

Assume that $j_E \notin K_0$ and that $E$ is admissible. Let $\eta : E \to E'$
be a separable cyclic isogeny of order $N$ defined over $K$. Then $(j_E, j_{E'})$
generate a separable subfield in $K$ which is isomorphic to the function field
of $X_0(N) \times K_0$, hence $g(K) \ge g(X_0(N)) \sim \frac{N}{12}$ and so $N < 12\,g(K)$, and
Conjecture 3 is true. Proposition 1.8 can be used to get: $N < 256\,d(K)$,
and so we can replace $g(K)$ by $d(K)$ in Conjecture 3.

What about Conjectures 4 ff? 

To give a flavour of the kind of problems which arise we simplify by
assuming that $\mathrm{char}(K_0) = 0$ and that $K_0$ is algebraically closed. A
non-constant point $P \in Z_{N,\varepsilon}(K)$ corresponds to a $K_0$-rational curve $C_P$ on
$Z_{N,\varepsilon}$ whose genus is bounded by $g(K)$ and whose irrationality degree is
bounded by $d(K)$. Assume that $P$ corresponds to the pair $(E_1, E_2)$ of
elliptic curves. Then the absolute invariants $j_i$ of $E_i$ satisfy an equation
over $K_0$ whose degree is a lower bound for the heights of $E_1$ and $E_2$ over
$K$, and this degree is the degree of the image curve of $C_P$ under the natural
map from $Z_{N,\varepsilon}$ to $\mathbb{P}^1 \times \mathbb{P}^1$ induced by the modular interpretation.

Hence Conjectures 4 and 4’ lead to a question which generalizes in some
sense Kani’s Conjecture 4”:


Question 2. Let $g$ (resp. $d$) be non-negative integers. Do there exist numbers
$N_0(g)$ (resp. $N_0'(d)$) such that for $N > N_0(g)$ (resp. $N > N_0'(d)$) and irreducible
curves $C \subset Z_{N,\varepsilon}$ with $g(C) \le g$ (resp. $d(C) \le d$) it follows that $C$ is either
a twist of a Hecke correspondence or the degree of its image in $\mathbb{P}^1 \times \mathbb{P}^1$ is
bounded by a constant depending on $g$ and $N$ (resp. $d$ and $N$)?


§3. K=Q 


What can we say about our conjectures if K = Q? 

There is one essential tool to use: The arithmetic of modular curves and 
its applications to elliptic curves over Q. There are two major reasons for 
the relations between modular curves and elliptic curves. Firstly level-n- 
structures are parametrized by points on these curves and secondly elliptic 
curves appear as factors of their Jacobians. 

The great power of the “modular method” is due to major contributions 
of many mathematicians. In our context we have to mention especially 
B. Mazur, J.P. Serre, K. Ribet and above all A. Wiles. 
That elliptic curves with cyclic isogenies of degree $N$ correspond to
points on $X_0(N)$ was used already in the previous sections. Over $\mathbb{Q}$ Mazur
in [Ma 2] succeeded in proving


Theorem 3.1. Let $E/\mathbb{Q}$ be an elliptic curve with $\mathbb{Q}$-rational cyclic isogeny
of degree $N$. Then $N \le 163$. Hence Conjecture 3 is true over $\mathbb{Q}$.


Moreover Mazur gives the entire list of such curves $E$.

The importance of congruence primes is already emphasized by Mazur
in [Ma 1]. These primes play a major role in the study of elliptic curves
$E$ which are factors of $J_0(N_E)$, i.e. which are modular elliptic curves; they
link arithmetical properties with attached Galois representations.

Here is the definition of these primes corresponding to a modular curve
$E$: Let $\varphi_E : X_0(N_E) \to E$ be a $\mathbb{Q}$-rational non-trivial morphism of minimal
degree. Let $\omega_E$ be the Néron differential of $E$. Then $\varphi_E^* \omega_E = c_\varphi f_E(z)\,dz$
with $c_\varphi \in \mathbb{Z} \setminus \{0\}$ and $f_E(z) = q + \sum_{i=2}^{\infty} a_i q^i$ with $f_E \in S_2(N_E)(\mathbb{Z})$ a newform.
For primes $l \nmid pN_E$ one has: $\mathrm{Tr}(\rho_{E,p}(\sigma_l)) \equiv a_l \bmod p$ where $\sigma_l$ is a Frobenius
element at $l$. We call $f_E$ the cusp form attached to $E$.

Now assume that $K$ is a number field and $g(z) = q + \sum_{i=2}^{\infty} b_i q^i \in S_2(N, O_K)$
is an eigenform such that there is a prime divisor $\mathfrak{p}$ of $p$ with $b_l \equiv a_l \bmod \mathfrak{p}$
for all $l \nmid pNN_E$. Then we say by abuse of language that $g$ is congruent to
$f_E$ modulo $p$ and, if $g \ne f_E$, that $p$ is a congruence prime for $f_E$ resp. for
$E$.

The following theorem uses Corollary 1.3 and is a special case of a beautiful
result of K. Ribet [Ribet] in which he proved a large part of Serre’s
conjecture for odd two-dimensional representations of $G_{\mathbb{Q}}$ (cf. [Se]) in the
modular case:


Theorem 3.2. (Ribet) Let $p$ be an odd prime. Assume that $E$ is a modular
elliptic curve which is semi-stable outside of $S_0$, let $f_E$ be the attached
newform in $S_2(N_E)(\mathbb{Z})$. Assume that $\rho_{E,p}$ is irreducible. Let

$$N'_E = \prod_{l \in S_0} l^{n_l} \prod_{\substack{l \notin S_0 \\ v_l(\Delta_E) \not\equiv 0 \bmod p}} l.$$

Then there is a number $N_0 \mid N'_E$ and a newform $g \in S_2(N_0)$ congruent
to $f_E$ modulo $p$.


Finally we come to Andrew Wiles’ [W] celebrated result. The following 
version includes the extension by F. Diamond, who showed that Wiles’ 
original everywhere semi-stability condition could be substantially relaxed: 


Theorem 3.3. (A. Wiles, F. Diamond) Assume that $E$ has a twist $E'$
which is semi-stable at 3 and 5. Then $E$ is a modular elliptic curve. Especially
it follows that $E$ is modular if $E_2 \subset E(\mathbb{Q})$.
As a first application we shall give a “modular version” of the height
conjecture for elliptic curves and hence of the A-B-C-Conjecture: Assume
that $E$ is modular with attached eigenform $f_E$. Then (cf. [Fr 1]) we have


d 


1 1 
ho(E) = 5 log deg yg — log |c,| — sloe( 3 / re) 


AQ (NeE)@C 


with du = 4n dz dy. 


Lemma 3.1. 
(i) There is a number k > 0 (independent of E.) such that 


1 2 
— du > k. 
7 , |fe|-du > 


XQ (NeE)@C 


Gi) tog( ff elPav) < 000g Ne). 
Xq(NeE)@C 


The (easy) proof of (i) can be found in [Fr 1]; the much more involved
proof of (ii) is a result due to Mai-Murty (see [MM]).


Proposition 3.1.
(i) Assume that $E$ is modular. Then $2h_{\mathbb{Q}}(E) \le \log(\deg \varphi_E) + O(1)$.
(ii) Assume that $E$ is semi-stable. Then $\log \deg \varphi_E \le 2h_{\mathbb{Q}}(E) + O(\log N_E)$.


Proof. Obvious using Lemma 3.1 and, for the second part, Theorem 3.3 and
the fact that $|c_\varphi|$ is bounded if $E$ is semi-stable (conjecturally $|c_\varphi| = 1$).


Hence the height conjecture for modular elliptic curves $E$ is true with
constant $c$ (and computable $d$) if for such curves $\log(\deg \varphi_E) \le 2c \log N_E +
d'$, and conversely for semi-stable elliptic curves the existence of such constants
follows from the height conjecture for elliptic curves.


Corollary 3.1. The A-B-C-Conjecture for relatively prime numbers $A$
and $B$ holds with constants $c, d$, if for modular elliptic curves $E$ we have:
$\deg \varphi_E \le c \log N_E + d + \log|c_\varphi| - \frac{1}{2}\log k$, and its truth (with some constants
$\tilde{c}, \tilde{d}$) is equivalent with an estimate of $\deg \varphi_E$ of this type for all $E$ with $E_2$
contained in $E(\mathbb{Q})$.


Remark 3.1. It is not difficult to show that $\log \deg \varphi_E \le O(N_E \log N_E)$ and
hence we get an exponential version of the A-B-C-Conjecture, but this is
not surprising since by transcendental methods one gets this type of results
as well.

There is an obvious relation between $\varphi_E$ and congruence primes, and
more generally, Question 1:

$\varphi_E$ induces a morphism $\psi_E : J_0(N_E) \to E$.
Let $\hat{\psi}_E$ be its dual map. Using theorem 1 we can assume that $\varphi_E$ does not
factor through another elliptic curve and so $\hat{\psi}_E$ is injective and $\ker(\psi_E) =:
A_E \subset J_0(N_E)$ is an abelian variety. Since $\psi_E \circ \hat{\psi}_E = \deg \varphi_E \circ \mathrm{id}$ it follows
that $\hat{\psi}_E(E) \cap A_E = \hat{\psi}_E(E)_{\deg \varphi_E}$, and so any prime divisor of $\deg \varphi_E$ is
a congruence prime. Moreover we find a subvariety $B$ in $A_E/\hat{\psi}_E(E)_{\deg \varphi_E}$
and a monomorphism $\alpha : E_{\deg \varphi_E} \to B$ such that

$$\Delta_\alpha = \{(P, \alpha P);\ P \in E_{\deg \varphi_E}\}$$

is an exceptional subgroup of $E \times B$.

So a positive answer to Question 1 would have strong consequences for
the height conjecture for elliptic curves, the A-B-C-Conjecture and the
Asymptotic Fermat Conjecture (remember that $\dim B \le O(N_E)$).


Example 3.1. Assume that $E/\mathbb{Q}$ has prime conductor $N_E = l$. Using
Theorem 3 and Theorem 2 and checking small primes one gets: $\Delta_E \mid l^5$,
and so Szpiro’s conjecture is true. To get the height conjecture one would
need more, namely the Hall Conjecture (cf. [Si]) which predicts that

$$\log(g_2(E)) \le d'(\varepsilon) + (2 + \varepsilon) \log|\Delta_E|$$

for all $\varepsilon \in \mathbb{R}_{>0}$.
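Let $n$ be a non zero integer and define

$$\delta_n = \begin{cases} 0 & \text{if } v_2(n) \ge 4, \\ 2 & \text{if } 2 \le v_2(n) \le 3, \\ 4 & \text{if } v_2(n) = 1, \\ 0 & \text{if } v_2(n) = 0. \end{cases}$$

Now take $A, B \in \mathbb{Z}$ relatively prime and assume that $A$ is even and that
$B \equiv 3 \bmod 4$. We use again theorems 3.1, 3.2, 3.3 and get: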

Proposition 3.2. The curve $E_{(A,B)} : Y^2 = X(X - A)(X - B)$ is modular,
and for primes $p > 5$ with $\rho_{E_{(A,B)},p}$ irreducible (e.g. $p > 163$) we get:
$f_{E_{(A,B)}}$ is congruent modulo $p$ to $g \in S_2(N_{(A,B),p})$ with

$$N_{(A,B),p} = 2^{\delta_A} \prod_{v_l(2^{-4}AB(A-B)) \not\equiv 0 \bmod p} l.$$


This result implies strong conditions for solutions of equations of Fermat
type. Take $a, b, c \in \mathbb{Z} \setminus \{0\}$ relatively prime. We want to study $L_{(a,b,c),p}(\mathbb{Q})$
for large primes $p$. We can assume without loss of generality that $v_2(a) >
v_2(b) + v_2(c)$ and that $p > v_2(a)$.

Define $L'_{(a,b,c),p}(\mathbb{Q})$ to be the set of triplets $(x, y, z) \in L_{(a,b,c),p}(\mathbb{Q})$
with $xyz \ne 0$ and assume that $\gcd(x, y, z) = 1$. Now apply Proposition 3.2
with $A = ax^p$, $B = by^p$ to get:
Corollary 3.2. Let $p$ be larger than 163. $L'_{(a,b,c),p}(\mathbb{Q})$ is empty if the genus
of

$$X_0\Big(2^{\delta_a} \prod_{v_l(2^{-4}abc) \not\equiv 0 \bmod p} l\Big)$$

is equal to 0, and there is no triplet $(x,y,z) \in L'_{(a,b,c),p}(\mathbb{Q})$ with $v_2(xz) > 0$
if the genus of

$$X_0\Big(\prod_{v_l(2^{-4}abc) \not\equiv 0 \bmod p} l\Big)$$

is equal to 0.


Since $g(X_0(N)) = 0$ if $N \le 10$ or $N = 12, 13, 16, 18, 25$, one can use this
corollary together with a careful discussion of small exponents to determine
$L'_{(a,b,c),p}(\mathbb{Q})$ for suitable $a, b, c$ (cf. [R 2]).
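The list of levels with $g(X_0(N)) = 0$ can be checked numerically. The sketch below uses the classical genus formula $g = 1 + \mu/12 - \nu_2/4 - \nu_3/3 - \nu_\infty/2$, where $\mu$ is the index of $\Gamma_0(N)$, $\nu_2$ and $\nu_3$ count elliptic points and $\nu_\infty$ counts cusps. This formula is standard but is not stated in the text, and all function names are illustrative.

```python
from math import gcd


def prime_factors(n):
    """Distinct prime factors of n >= 1, in increasing order."""
    ps, p = [], 2
    while p * p <= n:
        if n % p == 0:
            ps.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        ps.append(n)
    return ps


def euler_phi(n):
    """Euler's totient function."""
    result = n
    for p in prime_factors(n):
        result -= result // p
    return result


def genus_x0(N):
    """Genus of the modular curve X_0(N) via the classical formula."""
    primes = prime_factors(N)
    mu = N
    for p in primes:
        mu = mu // p * (p + 1)                  # index of Gamma_0(N)
    nu2 = 0 if N % 4 == 0 else 1                # elliptic points of order 2
    nu3 = 0 if N % 9 == 0 else 1                # elliptic points of order 3
    for p in primes:
        nu2 *= 1 + (0 if p == 2 else (1 if p % 4 == 1 else -1))
        nu3 *= 1 + (0 if p == 3 else (1 if p % 3 == 1 else -1))
    nu_inf = sum(euler_phi(gcd(d, N // d))      # number of cusps
                 for d in range(1, N + 1) if N % d == 0)
    twelve_g = 12 + mu - 3 * nu2 - 4 * nu3 - 6 * nu_inf
    return twelve_g // 12


if __name__ == "__main__":
    print([N for N in range(1, 31) if genus_x0(N) == 0])
```

Running the loop over a larger range than 30 is a cheap way to confirm that no further genus-zero levels appear.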

We look only at one special case: $a = b = c = 1$ and so $N = 2$ and get


Fermat’s Last Theorem. Let $n$ be a natural number $\ge 3$ and $x, y, z \in \mathbb{Z}$
with $x^n - y^n = z^n$. Then $xyz = 0$.


Another interesting case is that $f_{E_{(A,B)}}$ (with $A = ax^p$, $B = by^p$, $A - B =
cz^p$ as above) is congruent modulo $p$ to a newform $g_0$ belonging to an elliptic
curve $E_0$ with conductor $N_0 \mid N_{(A,B),p} \mid 2^{\delta_a + 1} abc$, with $g_0(z) = q + \sum_{i=2}^{\infty} b_i q^i$.

Let $l$ be a prime not dividing $pN_0$ but dividing $ABC$. Then there is a
quadratic extension $K_l$ of $\mathbb{Q}_l$ such that $E_{(A,B)}(K_l)$ contains a point of
order $p$. Hence $E_0(K_l)$ contains a point of order $p$ and since it has good
reduction at $l$ we get: $l > p^{1/2} + 1$. So for $l < p^{1/2}$ the curve $E_{(A,B)}$ has
good reduction and $a_l \equiv b_l \bmod p$. Since $|a_l|, |b_l| < 2\sqrt{l} + 1$ we
have $a_l = b_l$ for $p > 4\sqrt{l} + 2$. Especially $E_0 \bmod l$ has 4 rational points of
two-power order.
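The two facts used in this argument, the Hasse bound on $a_l$ and the presence of four rational points of 2-power order modulo $l$, can be observed on a concrete curve $Y^2 = X(X-A)(X-B)$. The following sketch counts points by brute force with the Legendre symbol; the function names and the sample values $A = 16$, $B = -9$ are illustrative assumptions, not data from the text.

```python
def legendre_symbol(a, l):
    """Legendre symbol (a/l) for an odd prime l."""
    a %= l
    if a == 0:
        return 0
    return 1 if pow(a, (l - 1) // 2, l) == 1 else -1


def frey_trace(A, B, l):
    """a_l = l + 1 - #E(F_l) for E: Y^2 = X(X-A)(X-B), l an odd prime of good reduction."""
    if l == 2 or (A * B * (A - B)) % l == 0:
        raise ValueError("l must be an odd prime not dividing AB(A-B)")
    # #E(F_l) = 1 + sum over x of (1 + (f(x)/l)), hence a_l = -sum of (f(x)/l).
    return -sum(legendre_symbol(x * (x - A) * (x - B), l) for x in range(l))


if __name__ == "__main__":
    A, B = 16, -9
    for l in (7, 11, 13, 17, 19):
        a_l = frey_trace(A, B, l)
        n_points = l + 1 - a_l
        # Hasse: a_l^2 <= 4l.  Full rational 2-torsion forces 4 | #E(F_l).
        print(l, a_l, n_points, a_l * a_l <= 4 * l, n_points % 4 == 0)
```

Since both $a_l$ and $b_l$ satisfy the Hasse bound, a congruence $a_l \equiv b_l \bmod p$ with $p$ larger than $4\sqrt{l}$ already forces equality, which is the mechanism the paragraph relies on.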

Hence we get for $l$ prime to $N_0$ and $l < \min((p - 2)^2/16,\ p^{1/2})$ that
$E_0 \bmod l$ has rational points of order 2. For $p$ large enough (depending
on $N_0$ only) it follows that $E_0$ is isogenous to an elliptic curve $E_0'$ with
$E'_{0,2} \subset E_0'(\mathbb{Q})$. Replace $E_0$ by $E_0'$. Then $E_0$ is given by an equation

$$Y^2 = X(X - a')(X - b') \quad \text{with} \quad \mathrm{supp}(a'b'(a' - b')) \,\Big|\, 2^{\delta_a + 1} \prod_{l \mid abc} l.$$


There is one case in which we can be sure that gg belongs to an elliptic 
curve: 


Proposition 3.3. Suppose for $N_0 = 2^{\delta_a + 1} \prod_{l \mid abc} l$ we have $g(X_0(N_0)) = 1$
and that $X_0(N_0)$ does not have 4 $\mathbb{Q}$-rational points of 2-power order. Then
there is an effectively computable number $p_0(N_0)$ such that for $p > p_0(N_0)$
we have: $L'_{(a,b,c),p}(\mathbb{Q}) = \emptyset$.


Now there are only very few modular curves of genus 1 and hence the
assumption of the proposition is not often satisfied. But it is an observation
of Mazur that one can enforce the existence of $E_0$ by choosing $p$ large:

The situation is as above. We assume that $p > 163$ and apply theorem
3.3 to get a newform $g_0 \in S_2(N_0)(K)$ with Fourier coefficients $b_l \in O_K$ such
that for a divisor $\mathfrak{p}$ of $p$ and prime numbers $l \nmid pN_0$ we get: $a_l \equiv b_l \bmod \mathfrak{p}$.

Now the degree of $K/\mathbb{Q}$ can be bounded by $g(X_0(N_0)) \sim N_0/6$, and since
the absolute values of the conjugates of $b_l$ are bounded by $2l^{1/2}$ it follows
that for $\log(p) > O(N_0 \log(N_0))$ we have: $a_l = b_l$ for all $l < 2g(X_0(N_0))$
and hence $g_0 \in S_2(N_0)(\mathbb{Q})$.

We summarize: 


Proposition 3.4. Assume that $a, b, c$ are relatively prime integers different
from 0 and define $N_0$ as above. Then there is an effectively computable
number $p_0(N_0)$ $(< O(N_0 \log(N_0)))$ such that for $p > p_0(N_0)$ and
$L'_{(a,b,c),p}(\mathbb{Q}) \ne \emptyset$ we get: $X_0(N_0)$ has an elliptic factor $E_0$ with $E_{0,2} \subset
E_0(\mathbb{Q})$ and $\rho_{E_0,p} \cong \rho_{E_{(ax^p, by^p)},p}$ for $(x, y, z) \in L'_{(a,b,c),p}(\mathbb{Q})$.


Corollary. (Mazur) Let $q$ be a prime. If the equation $2^4 X^p - Y^p = qZ^p$
has solutions for infinitely many $p$ then $q = 17$.


Proof. $X_0(q)$ has to have an elliptic factor given by $Y^2 = X(X - a')(X - b')$
and (since it has good reduction outside of $q$) $a' = 2^4$ and $|b'(a' - b')| = q$.


Finally we observe a consequence of proposition 3.4 which brings our
conjectures about solutions of equations of Fermat-type and about exceptional
subgroups of $E_1 \times E_2$ nicely together:


Proposition 3.5. Conjecture 2 is equivalent to Conjecture 5, i.e., for any
finite set $S_0 \subset \mathbb{P}$ the set

$$\bigcup_{\substack{\mathrm{supp}(abc) \,\mid\, \prod_{l \in S_0} l \\ p > 5}} L'_{(a,b,c),p}(\mathbb{Q})$$

is finite if and only if the set

$$\Big\{E/\mathbb{Q};\ E_2 \subset E(\mathbb{Q}) \text{ and there is a prime } p > 5 \text{ and a curve } E_0 \text{ with } N_{E_0} \,\Big|\, \prod_{l \in S_0} l \text{ and } \rho_{E_0,p} \cong \rho_{E,p}\Big\}$$

is finite.


Proof. One direction is true in general and was proved in Proposition 1.7.
Conversely assume that Conjecture 5 is true and that there are infinitely
many solutions $(x,y,z) \in L'_{(a,b,c),p}(\mathbb{Q})$ for $p > 5$, $\mathrm{supp}(abc) \mid \prod_{l \in S_0} l$.
Hence there are infinitely many non-isogenous elliptic curves $E_{(ax^p, by^p)}$
which are modular with $N_{(ax^p, by^p),p} \mid 2^{n_0} \prod_{l \in S_0} l$ with $n_0 \le 5$. By
Proposition 3.4 there is one elliptic curve $E_0/\mathbb{Q}$ of level $N_0$ dividing $2^{n_0} \prod_{l \in S_0} l$ with
$\rho_{E_0,p} \cong \rho_{E_{(ax^p, by^p)},p}$ for infinitely many primes $p$, which contradicts
Conjecture 5.
WILES’ THEOREM AND THE ARITHMETIC OF
ELLIPTIC CURVES 


HENRI DARMON 


Thanks to the work of Wiles [Wi], completed by Taylor-Wiles [TW] 
and extended by Diamond [Di], we now know that all elliptic curves over 
the rationals (having good or semi-stable reduction at 3 and 5) are mod- 
ular. This breakthrough has far-reaching consequences for the arithmetic 
of elliptic curves. As Mazur wrote in [Ma3], “It has been abundantly clear
for years that one has a much more tenacious hold on the arithmetic of
an elliptic curve $E/\mathbb{Q}$ if one supposes that it is [...] parametrized [by a
modular curve].” This expository article explores some of the implications
of Wiles’ theorem for the theory of elliptic curves, with particular emphasis 
on the Birch and Swinnerton-Dyer conjecture, now the main outstanding 
problem in the field. 


1 Prelude: Plane Conics, Fermat and Gauss 


In a volume devoted to Wiles’ proof of Fermat’s Last Theorem, what better 
place to begin this discussion than the Diophantine equation 


$$C : x^2 + y^2 = 1,$$ (1)


which also figured prominently in Diophantus’ treatise, and prompted Fer- 
mat’s famous marginal comment, more than 350 years ago? 

The set C(Q) of rational solutions to equation (1) is well understood, 
thanks to the parametrization 


$$(x, y) = \left( \frac{t^2 - 1}{t^2 + 1}, \frac{2t}{t^2 + 1} \right),$$ (2)

giving the classification of Pythagorean triples well-known to the ancient
Babylonians. The integer solutions are even simpler: there are $N_{\mathbb{Z}} = 4$
integer lattice points $(\pm 1, 0)$ and $(0, \pm 1)$ on the circle of radius 1.
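
The parametrization can be tried out directly. The short Python sketch below is an illustration, not part of the original text; the function names are hypothetical. It evaluates the parametrization of the circle at a rational parameter t using exact fractions, and clears denominators for t = m/n to produce a Pythagorean triple.

```python
from fractions import Fraction


def circle_point(t) -> tuple:
    """Rational point on x^2 + y^2 = 1 attached to the parameter t."""
    t = Fraction(t)
    d = t * t + 1
    return ((t * t - 1) / d, 2 * t / d)


def pythagorean_triple(m: int, n: int) -> tuple:
    """Integer triple (a, b, c) with a^2 + b^2 = c^2 obtained from t = m/n."""
    return (m * m - n * n, 2 * m * n, m * m + n * n)
```

For instance t = 2 gives the point (3/5, 4/5), which corresponds to the triple (3, 4, 5).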
It has become a dominant theme in number theory that curves such
as C ought to be studied over various fields, such as the real or complex
numbers, the finite fields $\mathbb{F}_p$, and the p-adic fields $\mathbb{Q}_p$, for each prime p.

The solutions to (1) in $\mathbb{R}^2$ describe the locus of points on the circle of
radius 1. A natural measure of the size of this solution set is the
circumference of the circle: $N_{\mathbb{R}} = 2\pi$.

The set of $\mathbb{F}_p$-valued solutions $C(\mathbb{F}_p)$ is finite, of cardinality $N_p$. Let
$a_p = p - N_p$. Is there a convenient formula for $N_p$, or equivalently, for
$a_p$? Letting t run over the values $t = 0, 1, 2, \ldots, p - 1, \infty \in \mathbb{P}_1(\mathbb{F}_p)$ in the
parametrization (2) gives p + 1 distinct points in $C(\mathbb{F}_p)$, with one important
caveat: if $t^2 + 1 = 0$ has a solution $t_0 \in \mathbb{F}_p$, then the values $t = \pm t_0$ do not
give rise to points over $\mathbb{F}_p$. Hence, if p is odd:


$$a_p = \begin{cases} +1 & \text{if } -1 \text{ is a square mod } p; \\ -1 & \text{if } -1 \text{ is not a square mod } p. \end{cases}$$


The condition which determines the value of $a_p$ might seem subtle to the
uninitiated. But much the opposite is true, thanks to the following result 
which is due to Fermat himself: 


Theorem 1.1 (Fermat) If p is an odd prime,


$$a_p = \begin{cases} +1 & \text{if } p \equiv 1 \pmod 4, \\ -1 & \text{if } p \equiv 3 \pmod 4, \end{cases}$$


and $a_2 = 0$.


The computational advantage of this formula is obvious. It now suffices to
glance at the last two decimal digits of p to determine whether $N_p$ is equal
to p - 1 or p + 1.
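
Theorem 1.1 is easy to test numerically. The following Python sketch is an illustration with hypothetical function names. It counts the solutions of $x^2 + y^2 = 1$ over $\mathbb{F}_p$ by brute force, so that $a_p = p - N_p$ can be compared with the value predicted from p mod 4.

```python
def count_circle_points(p: int) -> int:
    """Number N_p of pairs (x, y) in F_p x F_p with x^2 + y^2 = 1."""
    return sum(1 for x in range(p) for y in range(p)
               if (x * x + y * y - 1) % p == 0)


def a_p_fermat(p: int) -> int:
    """Value of a_p = p - N_p predicted by Theorem 1.1 (p is assumed prime)."""
    if p == 2:
        return 0
    return 1 if p % 4 == 1 else -1
```

The brute-force count costs about p^2 operations, whereas the congruence test is immediate; this is the computational advantage referred to above.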
Let
$$L(C/\mathbb{Q}, s) = \prod_p (1 - a_p p^{-s})^{-1}$$

be the “Hasse-Weil zeta-function” associated to C. Thanks to Fermat’s 
theorem 1.1, one has: 


Corollary 1.2 The Hasse-Weil L-function $L(C/\mathbb{Q}, s)$ is equal to a Dirichlet
L-series $L(s, \chi)$, where $\chi : (\mathbb{Z}/4\mathbb{Z})^\times \to \pm 1$ is the unique non-trivial
quadratic Dirichlet character of conductor 4. In particular, $L(C/\mathbb{Q}, s)$ has
a functional equation and an analytic continuation to the entire complex
plane.


More precisely (see [Was] chapter 4), setting


$$\Lambda(C/\mathbb{Q}, s) = \left( \frac{4}{\pi} \right)^{s/2} \Gamma\left( \frac{s+1}{2} \right) L(C/\mathbb{Q}, s),$$


we have:


$$\Lambda(C/\mathbb{Q}, s) = \Lambda(C/\mathbb{Q}, 1 - s).$$ (3)

The special value $L(C/\mathbb{Q}, 1)$ is given by:

$$L(C/\mathbb{Q}, 1) = L(1, \chi) = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots = \frac{\pi}{4}.$$ (4)


In view of the formal equality $L(C/\mathbb{Q}, 1)$ “=” $\prod_p p/N_p$, equation (4) can be rewritten
in the suggestive form:


$$\left( \prod_p \frac{N_p}{p} \right) \cdot N_{\mathbb{R}} = 2 N_{\mathbb{Z}},$$ (5)

a formula which suggests a mysterious link between the solutions to C over
the reals, the finite fields $\mathbb{F}_p$, and the integers. The proof that we have
sketched, although quite simple, does little to dispel the mystery.
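
Both sides of this identity can be approximated numerically. The Python sketch below is an illustration only, and the function names and cut-offs are arbitrary choices. It computes partial sums of the series in (4) and partial products of $N_p/p$ over primes $p \le x$, using $N_p = p - \chi(p)$. The infinite product converges only conditionally, so the partial products approach $4/\pi$ slowly and with some oscillation.

```python
import math


def chi4(n: int) -> int:
    """Non-trivial Dirichlet character modulo 4."""
    return {1: 1, 3: -1}.get(n % 4, 0)


def primes_up_to(x: int) -> list:
    """Sieve of Eratosthenes."""
    sieve = [True] * (x + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(x ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, x + 1, i):
                sieve[j] = False
    return [i for i in range(2, x + 1) if sieve[i]]


def leibniz_partial_sum(terms: int) -> float:
    """Partial sum of L(1, chi) = 1 - 1/3 + 1/5 - ..."""
    return sum(chi4(n) / n for n in range(1, terms + 1))


def euler_partial_product(x: int) -> float:
    """Partial product of N_p / p over primes p <= x, with N_p = p - chi4(p)."""
    prod = 1.0
    for p in primes_up_to(x):
        prod *= (p - chi4(p)) / p
    return prod
```

Multiplying `euler_partial_product(x)` by $N_{\mathbb{R}} = 2\pi$ gives values close to $2N_{\mathbb{Z}} = 8$ once x is moderately large.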


Another example which was also at the center of Fermat’s preoccupa- 
tions is the Fermat-Pell equation 


$$H : x^2 - Dy^2 = 1,$$ (6)


where D is a positive square-free integer. (H is for “hyperbola.”) Assume
for simplicity that D is congruent to 5 mod 8. 

Defining the integers $N_p$ and $a_p = p - N_p$ as before, one finds that for
p not dividing 2D,


$$a_p = \begin{cases} +1 & \text{if } D \text{ is a square mod } p, \\ -1 & \text{if } D \text{ is not a square mod } p. \end{cases}$$


Extend the definition of $a_p$ by setting $a_p = 0$ if $p \mid 2D$. By Gauss’s theorem
of quadratic reciprocity:


Theorem 1.3 (Gauss) Let

$$\chi_D : (\mathbb{Z}/D\mathbb{Z})^\times \to \pm 1$$
be the even (non-primitive) Dirichlet character of conductor 2D defined by
$\chi_D(n) = \left( \frac{n}{D} \right)$. Then $a_p = \chi_D(p)$.
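
The content of Theorem 1.3 can be checked on examples. The following Python sketch is an illustration; the function names are hypothetical and the Jacobi symbol routine is the standard binary algorithm, not something taken from the text. It counts points on the hyperbola $x^2 - Dy^2 = 1$ over $\mathbb{F}_p$ and compares $a_p = p - N_p$ with the Jacobi symbol $(p/D)$ for odd primes p not dividing D.

```python
def jacobi(a: int, n: int) -> int:
    """Jacobi symbol (a/n) for an odd positive integer n."""
    a %= n
    result = 1
    while a:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def count_hyperbola_points(D: int, p: int) -> int:
    """Number N_p of pairs (x, y) in F_p x F_p with x^2 - D y^2 = 1."""
    return sum(1 for x in range(p) for y in range(p)
               if (x * x - D * y * y - 1) % p == 0)
```

The left-hand side depends on whether D is a square modulo p, the right-hand side on the class of p modulo D; their agreement is exactly quadratic reciprocity for D congruent to 1 mod 4.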
Define the Hasse-Weil L-function $L(H/\mathbb{Q}, s) = \prod_p (1 - a_p p^{-s})^{-1}$ as before.

Corollary 1.4 The function $L(H/\mathbb{Q}, s)$ is equal to the Dirichlet L-series
$L(s, \chi_D)$, so that it has a functional equation and an analytic continuation
to the entire complex plane.


The precise functional equation, similar to equation (3), can be found in 
[Was], chapter 4. As before the value L(H/Q, 1) can be evaluated in closed 
form (see [Was], theorem 4.9): 


$$L(H/\mathbb{Q}, 1) = L(1, \chi_D) = -\frac{3}{\sqrt{4D}} \sum_{a=1}^{D} \chi_D(a) \log|1 - \zeta_D^a|,$$ (7)

where $\zeta_D = e^{2\pi i/D}$ is a primitive D-th root of unity.

To gain further insight into the arithmetic significance of this special 
value, one uses the following result of Gauss, which is a primary ingredi- 
ent in one of his proofs of quadratic reciprocity, and is in fact essentially 
equivalent to it. 


Theorem 1.5 Every quadratic field is contained in a cyclotomic field
generated by roots of unity. More precisely, the quadratic field $\mathbb{Q}(\sqrt{D})$ is
contained in $\mathbb{Q}(\zeta_D)$, and the homomorphism of Galois theory


$$\mathrm{Gal}(\mathbb{Q}(\zeta_D)/\mathbb{Q}) = (\mathbb{Z}/D\mathbb{Z})^\times \to \mathrm{Gal}(\mathbb{Q}(\sqrt{D})/\mathbb{Q}) = \pm 1$$
is identified with the Dirichlet character $\chi_D$.
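
One classical way to see the inclusion of the quadratic field in the cyclotomic field is through the quadratic Gauss sum: for a positive square-free D congruent to 1 mod 4, the sum of $\left(\frac{a}{D}\right) \zeta_D^a$ over a equals $\sqrt{D}$. This standard fact is not spelled out in the text. The Python sketch below is an illustration with hypothetical function names; it checks the identity numerically in floating point.

```python
import cmath
import math


def jacobi(a: int, n: int) -> int:
    """Jacobi symbol (a/n) for an odd positive integer n."""
    a %= n
    result = 1
    while a:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def gauss_sum(D: int) -> complex:
    """Sum of (a/D) * zeta_D^a over a = 1, ..., D - 1."""
    zeta = cmath.exp(2j * cmath.pi / D)
    return sum(jacobi(a, D) * zeta ** a for a in range(1, D))
```

For D = 5, 13, 21, 29 the sum agrees with $\sqrt{D}$ to within rounding error, which exhibits $\sqrt{D}$ as an explicit element of $\mathbb{Q}(\zeta_D)$.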


One of the applications of theorem 1.5 is that it gives a natural way of
finding units in $\mathbb{Q}(\sqrt{D})$, and thereby solving Pell’s equation. Indeed, the
cyclotomic field $\mathbb{Q}(\zeta_D)$ is equipped with certain natural units, the so-called
circular units. These are algebraic integers of the form $(1 - \zeta_D^a)$ if D is
not prime, and of the form $\frac{1 - \zeta_D^a}{1 - \zeta_D}$ if D is prime, with $a \in (\mathbb{Z}/D\mathbb{Z})^\times$. In
particular, theorem 1.5 implies that the expression


$$u_D = \prod_{a=1}^{D} (1 - \zeta_D^a)^{3\chi_D(a)}$$


is an element of norm 1 in the quadratic field $\mathbb{Q}(\sqrt{D})$, and in fact, in the
ring $\mathbb{Z}[\sqrt{D}]$. Hence, formula (7) can be rewritten:


$$L(H/\mathbb{Q}, 1) = \frac{1}{\sqrt{4D}} \log|x_0 + y_0 \sqrt{D}|,$$ (8)


where $(x_0, y_0)$ is an integer solution to equation (6). The non-vanishing of
$L(1, \chi_D)$ (or, equivalently, by the functional equation, of $L'(0, \chi_D)$) implies
that this solution is non-trivial.
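
The passage from circular units to a solution of Pell's equation can be carried out numerically. The Python sketch below is an illustration under stated assumptions: it takes the unit to be the product of $(1 - \zeta_D^a)^{3\chi_D(a)}$ over a prime to D, with $\chi_D(a)$ the Jacobi symbol $(a/D)$ and D square-free and congruent to 5 mod 8; it works in floating point, and it recovers the integers by rounding. The function names are hypothetical.

```python
import cmath
import math


def jacobi(a: int, n: int) -> int:
    """Jacobi symbol (a/n) for an odd positive integer n."""
    a %= n
    result = 1
    while a:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def circular_unit(D: int) -> float:
    """Floating-point value of the product of (1 - zeta_D^a)^(3 (a/D))."""
    zeta = cmath.exp(2j * cmath.pi / D)
    u = 1.0 + 0.0j
    for a in range(1, D):
        u *= (1 - zeta ** a) ** (3 * jacobi(a, D))
    return u.real  # the product is real and positive because the character is even


def pell_solution(D: int) -> tuple:
    """Integers (x0, y0) with x0 + y0 sqrt(D) = 1 / u_D, recovered by rounding."""
    u = circular_unit(D)
    x0 = round((u + 1 / u) / 2)
    y0 = round(abs(1 / u - u) / (2 * math.sqrt(D)))
    return x0, y0
```

For D = 5 the unit is $9 - 4\sqrt{5}$, so the sketch returns (9, 4); for D = 13 it returns (649, 180). For larger D the integers grow quickly and exact arithmetic would be needed in place of floating point.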


Remark: A natural generalization of theorem 1.5, the Kronecker-Weber
theorem, states that every abelian extension L of the rationals is contained
in a cyclotomic field. The norms of circular units always give a subgroup
of finite index in the group of units of L.


2 Elliptic Curves and Wiles’ Theorem 


Let $E/\mathbb{Q}$ be an elliptic curve over the rationals of conductor N, given by
the projective equation


$$y^2 z + a_1 xyz + a_3 y z^2 = x^3 + a_2 x^2 z + a_4 x z^2 + a_6 z^3.$$ (9)

By the Mordell-Weil theorem, the Mordell-Weil group $E(\mathbb{Q})$ is a finitely
generated abelian group,


$$E(\mathbb{Q}) \simeq \mathbb{Z}^r \oplus T,$$


where T is the finite torsion subgroup of $E(\mathbb{Q})$. Paraphrasing a remark
of Mazur ([Ma3], page 186), there are resonances between the problem
of studying integer points on plane conics and rational points on elliptic
curves. In the basic trichotomy governing the study of curves over $\mathbb{Q}$, these
Diophantine problems correspond to the only classes of curves having Euler
characteristic equal to 0. The Euler characteristic $\chi(X)$ depends only on
the Riemann surface $X(\mathbb{C})$, which is topologically equivalent to a compact
surface of genus g with s points removed. The Euler characteristic is defined
by
$$\chi(X) = (2 - 2g) - s.$$


2.1 Wiles’ Theorem and $L(E/\mathbb{Q}, s)$


If p is a prime of good reduction for E, let $N_p$ be the number of distinct
solutions to equation (9) in $\mathbb{P}^2(\mathbb{F}_p)$, and set


$$a_p = p + 1 - N_p.$$


Further, set $a_p = 1$ if $E/\mathbb{Q}_p$ has split multiplicative reduction, $a_p = -1$ if
$E/\mathbb{Q}_p$ has non-split multiplicative reduction, and $a_p = 0$ otherwise. Define
the Hasse-Weil L-function $L(E/\mathbb{Q}, s)$ by the formula


$$L(E/\mathbb{Q}, s) = \prod_{p \nmid N} (1 - a_p p^{-s} + p^{1-2s})^{-1} \prod_{p \mid N} (1 - a_p p^{-s})^{-1}.$$


To study the elliptic curve E along the lines of section 1, one needs a
better understanding of the coefficients $a_p$, allowing an analysis of the
L-function $L(E/\mathbb{Q}, s)$. This is precisely the content of Wiles’ theorem, stated
here in a form which is analogous to theorems 1.1 and 1.3. 


Theorem 2.1 ([Wi], [TW], [Di]) Assume E has good or semi-stable
reduction at 3 and 5. Then the coefficients $a_p$ are the Fourier coefficients of
a modular form f of weight 2 and level N which is an eigenform for all the
Hecke operators $T_p$.
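
A classical instance of this statement, predating Wiles' work, can be verified by machine. The Python sketch below is an illustration; the choice of example is not taken from the text. It uses the curve $y^2 + y = x^3 - x^2$ of conductor 11, whose associated eigenform is known to be $q \prod_{n \ge 1} (1 - q^n)^2 (1 - q^{11n})^2$. It counts points over $\mathbb{F}_p$ by brute force and compares $a_p = p + 1 - N_p$ with the coefficient of $q^p$ in the product. The function names are hypothetical.

```python
def count_projective_points(p: int) -> int:
    """Projective points over F_p on y^2 + y = x^3 - x^2 (affine points plus infinity)."""
    affine = sum(1 for x in range(p) for y in range(p)
                 if (y * y + y - x ** 3 + x * x) % p == 0)
    return affine + 1


def eta_product_coefficients(n_terms: int) -> list:
    """Coefficients c[0..n_terms] of q * prod_{n>=1} (1 - q^n)^2 (1 - q^(11n))^2."""
    series = [0] * (n_terms + 1)
    series[1] = 1  # the leading factor q
    for n in range(1, n_terms + 1):
        for step in (n, 11 * n):
            for _ in range(2):  # each factor occurs squared
                # multiply the truncated series by (1 - q^step), in place
                for k in range(n_terms, step - 1, -1):
                    series[k] -= series[k - step]
    return series
```

The expansion begins $q - 2q^2 - q^3 + 2q^4 + q^5 + 2q^6 - 2q^7 - \cdots$, and the point counts give $a_2 = -2$, $a_3 = -1$, $a_5 = 1$, $a_7 = -2$. At the bad prime 11 the count of all projective solutions (including the singular point) gives $a_{11} = 1$, in agreement with split multiplicative reduction.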


This result has the following elliptic curve analogue of corollary 1.4.


Corollary 2.2 (Hecke) The L-function $L(E/\mathbb{Q}, s)$ is equal to the
L-function $L(f, s)$ attached to the eigenform f. In particular, it has an analytic
continuation and a functional equation.

More precisely, setting


$$\Lambda(E/\mathbb{Q}, s) = N^{s/2} (2\pi)^{-s} \Gamma(s) L(E/\mathbb{Q}, s),$$


we have:

$$\Lambda(E/\mathbb{Q}, s) = N^{s/2} \int_0^\infty f(iy) y^{s-1} \, dy$$ (10)

and

$$\Lambda(E/\mathbb{Q}, s) = w \Lambda(E/\mathbb{Q}, 2 - s),$$ (11)


where $w = \pm 1$ can be computed as a product of local signs. For example:


Proposition 2.3 If $E/\mathbb{Q}$ is a semistable curve, then w is equal to $(-1)^{s+1}$,
where s is the number of primes of split multiplicative reduction for $E/\mathbb{Q}$.
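
Proposition 2.3 turns the computation of w into a point count at the bad primes. The Python sketch below is an illustration under stated assumptions: the Weierstrass model is minimal and semistable, the list of bad primes is supplied by the caller, and the reduction type is read off from the count of all projective solutions modulo p (a nodal cubic has p points in the split case and p + 2 in the non-split case, the node included). The function names are hypothetical.

```python
def count_projective_points(coeffs: tuple, p: int) -> int:
    """Projective solutions mod p of y^2 + a1 xy + a3 y = x^3 + a2 x^2 + a4 x + a6."""
    a1, a2, a3, a4, a6 = coeffs
    affine = sum(
        1 for x in range(p) for y in range(p)
        if (y * y + a1 * x * y + a3 * y - x ** 3 - a2 * x * x - a4 * x - a6) % p == 0
    )
    return affine + 1  # add the point at infinity


def root_number_semistable(coeffs: tuple, bad_primes: list) -> int:
    """Sign w = (-1)^(s+1), s = number of bad primes with a_p = +1 (split reduction)."""
    s = sum(1 for p in bad_primes
            if p + 1 - count_projective_points(coeffs, p) == 1)
    return (-1) ** (s + 1)
```

For $y^2 + y = x^3 - x^2$ (conductor 11, split at 11) this gives w = 1, and for $y^2 + y = x^3 - x$ (conductor 37, non-split at 37) it gives w = -1, consistent with these curves having rank 0 and rank 1 respectively.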


Remark. The statements of Wiles’ theorem given in theorem 2.1 and
corollary 2.2 bear a strong resemblance to theorem 1.3 and corollary 1.4
respectively. This is only fitting, as Wiles’ theorem is a manifestation of
a non-abelian reciprocity law for $\mathrm{GL}_2$, having its roots ultimately in the
fundamental quadratic reciprocity law of Gauss.

More germane to the discussion of section 1, Wiles’ achievement allows
one to make sense of the special values $L(E/\mathbb{Q}, s) = L(f, s)$ even when s is
outside the domain $\{\mathrm{Re}(s) > 3/2\}$ of absolute convergence of the infinite
product used to define $L(E/\mathbb{Q}, s)$. This is of particular interest for the
point s = 1, which is related conjecturally to the arithmetic of $E/\mathbb{Q}$ by the
Birch and Swinnerton-Dyer conjecture.


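Conjecture 2.4 The Hasse-Weil L-function $L(E/\mathbb{Q}, s)$ vanishes to order
r, the rank of $E(\mathbb{Q})$, at s = 1, and


$$\lim_{s \to 1} (s - 1)^{-r} L(E/\mathbb{Q}, s) = \#\text{Ш}(E/\mathbb{Q}) \, \det\left( \langle P_i, P_j \rangle \right)_{1 \le i,j \le r} \, \#T^{-2} \left( \int_{E(\mathbb{R})} \omega \right) \prod_p m_p,$$


where $\text{Ш}(E/\mathbb{Q})$ is the (conjecturally finite) Shafarevich-Tate group of $E/\mathbb{Q}$,
the points $P_1, \ldots, P_r$ are a basis for $E(\mathbb{Q})$ modulo torsion, $\langle \, , \, \rangle$ is the
Néron-Tate canonical height, $\omega$ is the Néron differential on E, and $m_p$ is
the number of connected components in the Néron model of $E/\mathbb{Q}_p$.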


Motivated by this conjecture, one calls the order of vanishing of $L(E/\mathbb{Q}, s)$
at s = 1 the analytic rank of $E/\mathbb{Q}$, and denotes it $r_{an}$.

If E is a semistable elliptic curve, then the formula for w given in
proposition 2.3 implies that $t + r_{an}$ is always even, where t denotes the
number of analytic uniformizations (complex and p-adic) with which $E/\mathbb{Q}$
is endowed. (Here t = s + 1: the complex uniformization, together with one
p-adic Tate uniformization for each of the s primes of split multiplicative
reduction.) Hence, a corollary of conjecture 2.4 is the following parity
conjecture for the rank:

Conjecture 2.5 If $E/\mathbb{Q}$ is semistable, then the integer r + t is even.


A great deal of theoretical evidence is available for conjecture 2.4 when 
the analytic rank $r_{an}$ is equal to 0 or 1. By contrast, very little is known
when $\mathrm{ord}_{s=1} L(E/\mathbb{Q}, s) > 1$, and so conjecture 2.4, and even conjecture 2.5,
remain very mysterious. (Some numerical evidence has been gathered for 
certain specific elliptic curves, such as the curve of rank 3 and conductor 
5077, see [BGZ].) 


2.2 Geometric Versions of Wiles’ Theorem 


To tackle conjecture 2.4 requires an explicit formula for the leading term 
of L(E/Q,s) at s = 1. There are such formulae when the analytic rank 
is 0 or 1. In deriving them, essential use is made of the following geomet- 
ric version of Wiles’ theorem, which may be seen as a direct analogue of 
theorem 1.5: 


Theorem 2.6 Suppose that E has good or semistable reduction at 3 and
5. Then the elliptic curve E is uniformized by the modular curve $X_0(N)$,
i.e., there is a non-constant algebraic map defined over $\mathbb{Q}$:


$$\phi : X_0(N) \to E.$$


Here $X_0(N)$ is the usual modular curve which is a (coarse) moduli space
classifying elliptic curves together with a cyclic subgroup of order N. Its
complex points can be described analytically as a compactification of the
quotient

$$Y_0(N) = \mathcal{H}/\Gamma_0(N),$$


where $\Gamma_0(N)$ is the usual congruence subgroup of level N of $\mathrm{SL}_2(\mathbb{Z})$, and
$\mathcal{H}$ is the complex upper half plane of complex numbers $\tau$ with $\mathrm{Im}(\tau) > 0$.

The pull-back of the Néron differential $\omega$ on E is an integer multiple of
the differential $2\pi i f(\tau) \, d\tau = f(q) \frac{dq}{q}$, where f is the modular form given in
theorem 2.1 and $q = e^{2\pi i \tau}$:


$$\phi^* \omega = c \cdot 2\pi i f(\tau) \, d\tau.$$ (12)


The integer c is called the Manin constant associated to $\phi$. When the degree
of $\phi$ is minimal among all possible maps $X_0(N) \to E'$ with $E'$ isogenous
to E, it is conjectured that c = 1.

When theorem 2.6 is satisfied, the elliptic curve E is also uniformized by
other arithmetic curves, the Shimura curves associated to indefinite
quaternion algebras. Although somewhat less studied than classical modular
curves, they are endowed with a similarly rich arithmetic structure. They
play an important role in Ribet’s fundamental “lowering the level” results
(see the article of Edixhoven in this volume). It is also likely that a deeper
understanding of the arithmetic of elliptic curves might be achieved by
considering the collection of all modular and Shimura curve parametrizations
simultaneously. (See for example the remarks in [Ma3].)

Let $N = N^+ N^-$ be a factorization of N such that $N^-$ is square-free, is
the product of an even number of primes, and satisfies $\gcd(N^+, N^-) = 1$.
Let B be the indefinite quaternion algebra which is ramified exactly at the
primes dividing $N^-$, and let R be a maximal order in B. The algebra
B is unique up to isomorphism, and any two maximal orders in B are
conjugate. (For more on the arithmetic of quaternion algebras over $\mathbb{Q}$, see
[Vi].) The Shimura curve $X_{1,N^-}$ is defined as a (coarse) moduli space for
abelian surfaces with quaternionic multiplication by R, i.e., abelian surfaces
A equipped with a map

$$R \to \mathrm{End}(A).$$


The curve $X_{N^+,N^-}$ is a (coarse) moduli space for abelian surfaces with
quaternionic multiplication by R, together with a subgroup scheme
generically isomorphic to $\mathbb{Z}/N^+\mathbb{Z} \times \mathbb{Z}/N^+\mathbb{Z}$ and stable under the action of R.
Shimura showed that the curves $X_{1,N^-}$ and $X_{N^+,N^-}$ have canonical
models over $\mathbb{Q}$. Let $J_{N^+,N^-}$ be the Jacobian of $X_{N^+,N^-}$. By a theorem of
Jacquet-Langlands [JL], it is isogenous to a factor of the Jacobian $J_0(N)$
corresponding to the forms of level N which are new at the primes dividing
$N^-$, and hence we have:


Theorem 2.7 Suppose that E has good or semistable reduction at 3 and
5. Then E is a factor of the Jacobian $J_{N^+,N^-}$, i.e., there is a non-constant
algebraic map $\phi_{N^+,N^-}$ defined over $\mathbb{Q}$:


$$\phi_{N^+,N^-} : J_{N^+,N^-} \to E.$$


A nice account of the theory of Shimura curves can be found in [Jo] and 
[Ro]. 


Remark: The case where $N^- = 1$ corresponds to the case of the usual
modular curves. In this case, the algebra B is the matrix algebra $M_2(\mathbb{Q})$,
the order R can be chosen to be $M_2(\mathbb{Z})$, and an abelian surface with
endomorphisms by R is isomorphic to a product $A = E \times E$, where E is an
elliptic curve. The level N structure on A corresponds to a usual level N
structure on E, so that the curve $X_{N,1}$ is isomorphic to $X_0(N)$.

In general, there is considerable freedom in choosing the map $\phi_{N^+,N^-}$.
One rigidifies the situation by requiring that $\phi_{N^+,N^-}$ be optimal, i.e., that
its kernel be a (connected) abelian subvariety of $J_{N^+,N^-}$. This can always
be accomplished, if necessary by replacing E by another elliptic curve in
the same isogeny class.

Likewise, we will always assume in the next section that the morphism
$\phi$ of theorem 2.6 sends the cusp $i\infty$ to the identity of E, and that the map
induced by $\phi$ on Jacobians is optimal.


3 The Special Values of $L(E/\mathbb{Q}, s)$ at s = 1


We now review some of the information on the leading term of L(E/Q, s) 
at s = 1 which can be extracted from the knowledge that E is modular. 


3.1 Analytic Rank 0 


Theorem 3.1 There is a rational number M such that


$$L(E/\mathbb{Q}, 1) = M \int_{E(\mathbb{R})} \omega,$$


where $\omega$ is a Néron differential on E.


Proof: Let $0, i\infty$ be the usual cusps in the extended upper half-plane, and
let $\phi$ be the modular parametrization of theorem 2.6. The theorem of
Manin-Drinfeld says that the divisor $(i\infty) - (0)$ is torsion in $J_0(N)$, and
hence, if $\phi$ sends $i\infty$ to the point at infinity on E, then $\phi(0)$ is a torsion point
in E. By composing $\phi$ with an isogeny, assume without loss of generality
that $\phi(0) = \phi(i\infty)$ is the identity element in $E(\mathbb{Q})$. Then the modular
parametrization $\phi$ induces a map from the interval $[0, i\infty]$ (with the points
0 and $i\infty$ identified) to the connected component $E_0(\mathbb{R})$ of $E(\mathbb{R})$. Let $M_0$
be the winding number of this map between two circles. By the formula
for $L(E/\mathbb{Q}, 1)$ of equation (10),


$$L(E/\mathbb{Q}, 1) = 2\pi i \int_{i\infty}^{0} f(\tau) \, d\tau = \frac{1}{c} \int_{i\infty}^{0} \phi^* \omega = \frac{M_0}{c} \int_{E_0(\mathbb{R})} \omega = M \int_{E(\mathbb{R})} \omega,$$


where $M = \frac{M_0}{c} [E(\mathbb{R}) : E_0(\mathbb{R})]^{-1}$.

The reader should compare theorem 3.1 with equation (4), which also
expresses the special value $L(C/\mathbb{Q}, 1)$ as a rational multiple of the period
$2\pi$.
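
The left-hand side of theorem 3.1 can be evaluated to high accuracy once E is known to be modular. The Python sketch below is an illustration under stated assumptions: it treats the isogeny class of conductor 11, whose eigenform is $q \prod (1 - q^n)^2 (1 - q^{11n})^2$ and whose sign is w = 1, and it uses the standard rapidly convergent series $L(f, 1) = 2 \sum_{n \ge 1} \frac{a_n}{n} e^{-2\pi n/\sqrt{N}}$, which follows from the Mellin transform and the functional equation when w = 1. The function names and the truncation point are arbitrary choices.

```python
import math


def eta_product_coefficients(n_terms: int) -> list:
    """Coefficients c[0..n_terms] of q * prod_{n>=1} (1 - q^n)^2 (1 - q^(11n))^2."""
    series = [0] * (n_terms + 1)
    series[1] = 1
    for n in range(1, n_terms + 1):
        for step in (n, 11 * n):
            for _ in range(2):
                # multiply the truncated series by (1 - q^step), in place
                for k in range(n_terms, step - 1, -1):
                    series[k] -= series[k - step]
    return series


def l_value_at_one(n_terms: int = 60) -> float:
    """Approximate L(E/Q, 1) for conductor 11, assuming sign w = +1."""
    a = eta_product_coefficients(n_terms)
    x = math.exp(-2 * math.pi / math.sqrt(11))
    return 2 * sum(a[n] / n * x ** n for n in range(1, n_terms + 1))
```

The terms decay geometrically with ratio about 0.15, so a few dozen terms suffice; the result is approximately 0.25384186. The sketch does not compute the period integral on the right-hand side.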

While theorem 3.1 gives some evidence for the Birch and Swinnerton-Dyer
conjecture, proving that the value of $L(E/\mathbb{Q}, 1)$ is the correct one “up
to rational multiples”, it does not shed much light on the relation between
M and arithmetic quantities associated to E such as the rank of $E/\mathbb{Q}$ and
the order of $\text{Ш}(E/\mathbb{Q})$.

3.2. Analytic Rank 1: The Gross-Zagier Formula


Assume now that the sign w in the functional equation (11) for $L(E/\mathbb{Q}, s)$
is $-1$, so that the L-function of $E/\mathbb{Q}$ vanishes to odd order. The L-function
$L(E/\mathbb{Q}, s)$ now has an “automatic zero” at s = 1, and one might hope for
a natural closed form expression for the special value $L'(E/\mathbb{Q}, 1)$.

Rather surprisingly, no really “natural” closed form expression is known.
Instead, a formula can only be written down after choosing an auxiliary
quadratic imaginary field K. Let K be such a field, D its discriminant, and
let $\chi$ be the associated odd Dirichlet character. Let $E^{(D)}$ be the quadratic
twist of E, relative to the character $\chi$. Consider the L-series


$$L(E/K, s) := L(E/\mathbb{Q}, s) L(E^{(D)}/\mathbb{Q}, s).$$


There are (at least) two different ways to show that this L-series has an
analytic continuation and a functional equation relating its value at s and
2 - s. Since E and $E^{(D)}$ are both modular, each of the two factors on the
right has a functional equation and analytic continuation. Alternately, the
functional equation for $L(E/K, s)$ can be obtained by expressing $L(E/K, s)$
as the Rankin convolution of the L-series $L(f, s)$ with the L-series of a
theta-function of weight 1 associated to the imaginary quadratic field K,
and applying Rankin’s method. (See [GZ], chapter IV). If K is an arbitrary
quadratic field (not necessarily quadratic imaginary) one has


Proposition 3.2 The sign $w_K$ in the functional equation for $L(E/K, s)$
can be expressed as a product of local signs


$$w_K = \prod_v w_v,$$


where $w_v = \pm 1$ depends only on the behaviour of E over the completion
$K_v$. In particular,


1. If E has good reduction at v, then $w_v = 1$;

2. If v is archimedean, then $w_v = -1$;

3. If $E/K_v$ has split (resp. non-split) multiplicative reduction at v then
$w_v = -1$ (resp. $w_v = 1$).


Heegner Points: 

Just as cyclotomic fields are equipped with certain canonical units (the
circular units) whose logarithms express the special values of Dirichlet
L-series, so modular curves and Shimura curves are equipped with a certain
natural set of algebraic points, the Heegner points associated to the
imaginary quadratic field K, whose heights express first derivatives of the
L-functions attached to cusp forms.

Let A be any elliptic curve which has complex multiplication by the
maximal order $\mathcal{O}_K$ of K. There are exactly h such curves, where h is the
class number of K. They are all defined over the Hilbert class field H of
K and are conjugate to each other under the action of Gal(H/K).

Assume further that all the primes dividing the conductor N are split
in the imaginary quadratic field K. By proposition 3.2, this implies that
$w_K = -1$, so that the analytic rank of E(K) is odd.

Under this hypothesis, the complex multiplication curve A has a rational
subgroup of order N which is defined over H. This subgroup is not
unique, and choosing one amounts to choosing an integral ideal of norm
N in the quadratic field K. Choose such a subgroup C of A. The pair
(A, C) gives rise to a point $\alpha$ on $X_0(N)$ which is defined over H. It is
called a Heegner point on $X_0(N)$ (associated to the maximal order $\mathcal{O}_K$).
Let $P_H = \phi(\alpha)$ be the image of $\alpha$ on E(H) by the modular parametrization
$\phi$ of theorem 2.6, and let $P_K = \mathrm{trace}_{H/K} P_H$ be its trace to E(K). The
point $P_K$ (up to sign) depends only on the quadratic imaginary field K,
not on the choice of A and C. Hence, its Néron-Tate height is canonical.
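
The choice of an ideal of norm N can be made concrete. The Python sketch below is an illustration resting on a standard description that the text does not spell out, and which is assumed here: when every prime dividing N splits in K and gcd(N, D) = 1, the ideals of norm N with cyclic quotient correspond to residues b modulo 2N with $b^2 \equiv D \pmod{4N}$, and for A = C/$\mathcal{O}_K$ a corresponding point of the upper half plane is $\tau = (-b + \sqrt{D})/(2N)$. The function names are hypothetical.

```python
import cmath


def heegner_residues(N: int, D: int) -> list:
    """Residues b mod 2N with b^2 = D (mod 4N); each encodes an ideal of norm N."""
    return [b for b in range(2 * N) if (b * b - D) % (4 * N) == 0]


def heegner_tau(N: int, D: int, b: int) -> complex:
    """Upper half plane point (-b + sqrt(D)) / (2N) attached to the residue b."""
    return (-b + cmath.sqrt(D)) / (2 * N)
```

For N = 11 and D = -7 (the prime 11 splits in $\mathbb{Q}(\sqrt{-7})$) there are two residues, 9 and 13, matching the two primes of K above 11.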

The fundamental theorem of Gross and Zagier expresses the special
value of $L'(E/K, 1)$ in terms of the height of $P_K$.


Theorem 3.3 $L'(E/K, 1) = \left( \iint_{E(\mathbb{C})} \omega \wedge i\bar{\omega} \right) \langle P_K, P_K \rangle / c^2 u_K^2 |D|^{1/2}$.


The proof of this beautiful theorem, which is quite involved, is given in 
[GZ]. 


Remarks: 

1. Theorem 3.3 gives a formula for $L'(E/\mathbb{Q}, 1) L(E^{(D)}/\mathbb{Q}, 1)$, and in this
sense does not give a “natural” formula for $L'(E/\mathbb{Q}, 1)$ alone.

2. Theorem 3.3 is also true when w = 1. In this case, the twisted
L-function $L(E^{(D)}/\mathbb{Q}, s)$ vanishes at s = 1, and theorem 3.3 gives a formula
for $L(E/\mathbb{Q}, 1) L'(E^{(D)}/\mathbb{Q}, 1)$.


3.3. Some Variants of the Gross-Zagier Formula 


The fundamental formula of Gross and Zagier has been extended and gen- 
eralized in various directions in the last years. Let us mention very briefly 
a few of these variants: 


A. Shimura curve analogues: Assume here for simplicity that E is
semi-stable so that N is square-free, and that K is a quadratic imaginary field of
discriminant D with $\gcd(N, D) = 1$. Let $N = N^+ N^-$ be the factorization
of N such that $N^+$ is the product of all primes which are split in K, and
$N^-$ is the product of the primes which are inert in K. By proposition 3.2,
the integer $N^-$ is a product of an even number of prime factors if and only
if the sign $w_K$ in the functional equation for $L(E/K, s)$ is $-1$. Assume that
$w_K = -1$. The Gross-Zagier formula given in theorem 3.3 corresponds to
the case where $N^+ = N$, $N^- = 1$. Assume now that $N^- \ne 1$. One can
then define the Shimura curve $X_{N^+,N^-}$ as in section 2.2.
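
The factorization of N and the resulting sign are easy to compute. The Python sketch below is an illustration for semistable E and gcd(N, D) = 1; the function names are hypothetical, and the prime factors of N are supplied by the caller. Following proposition 3.2, split primes contribute two places with equal local signs, each inert prime contributes a single place at which the reduction becomes split multiplicative (local sign -1), and the archimedean place contributes -1.

```python
def kronecker_at_prime(D: int, p: int) -> int:
    """Kronecker symbol (D/p) for a discriminant D and a prime p."""
    if p == 2:
        if D % 2 == 0:
            return 0
        return 1 if D % 8 in (1, 7) else -1
    r = pow(D % p, (p - 1) // 2, p)  # Euler's criterion: r is 0, 1 or p - 1
    return -1 if r == p - 1 else r


def split_level(prime_factors: list, D: int) -> tuple:
    """Return (N_plus, N_minus) for a square-free N prime to D."""
    n_plus, n_minus = 1, 1
    for p in prime_factors:
        symbol = kronecker_at_prime(D, p)
        if symbol == 0:
            raise ValueError("N and D are assumed to be coprime")
        if symbol == 1:
            n_plus *= p
        else:
            n_minus *= p
    return n_plus, n_minus


def sign_over_K(prime_factors: list, D: int) -> int:
    """Sign w_K: -1 exactly when an even number of primes dividing N are inert."""
    inert = sum(1 for p in prime_factors if kronecker_at_prime(D, p) == -1)
    return -1 if inert % 2 == 0 else 1
```

For N = 11 the field of discriminant -7 gives $(N^+, N^-) = (11, 1)$ and $w_K = -1$, the setting of theorem 3.3, while discriminant -3 gives $(1, 11)$ and $w_K = 1$.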

The curve $X_{N^+,N^-}$ is equipped with Heegner points defined over the
Hilbert class field H of K, which correspond to moduli of quaternionic
surfaces with level $N^+$ structure having complex multiplication by $\mathcal{O}_K$, i.e.,
quaternionic surfaces A endowed with a map

$$\mathcal{O}_K \to \mathrm{End}(A),$$

where End(A) denotes the algebraic endomorphisms of A which commute
with the quaternionic multiplications. By considering the image in the
Mordell-Weil group E(H) of certain degree zero divisors supported on
Heegner points in $J_{N^+,N^-}(H)$ by $\phi_{N^+,N^-}$, one obtains a Heegner point $P_K$
in E(K), which cannot be obtained from the modular curve parametrization
$\phi$. One expects that the height of $P_K$ can be expressed in terms
of the derivative $L'(E/K, 1)$, in a manner analogous to theorem 3.3. In
particular, one expects that $P_K$ is of infinite order in E(K) if and only if
$L'(E/K, 1) \ne 0$. Nothing as precise has yet been established, but some
work in progress of Keating and Kudla supports this expectation.


B. Perrin-Riou’s p-adic analogue: In [PR], a formula is obtained (when all
primes dividing N are split in K) relating the first derivative of the
two-variable p-adic L-function of E/K to the p-adic height of the Heegner point
$P_K$. The calculations of [PR] are also quite involved, but on a conceptual
level they follow those of Gross and Zagier quite closely.


C. Rubin’s p-adic formula: Let E be an elliptic curve with complex
multiplication by $\mathcal{O}_K$. In [Ru], Rubin obtains a formula relating the derivative
of the two-variable p-adic L-function of E/K, at a point which lies outside
the range of classical interpolation, to the p-adic logarithm in the formal
group attached to E over $K \otimes \mathbb{Q}_p$ of a Heegner point in E(K). The proof
of this formula uses the theory of elliptic units, as well as the formula
of Gross-Zagier and Perrin-Riou’s p-adic analogue, in an essential way. A
striking feature of Rubin’s formula is that it allows one to recover a rational
point in E(K) as the formal group exponential evaluated on an expression
involving the first derivative of a p-adic L-function, in much the same way
that, if $\chi$ is an even Dirichlet character, exponentiating $L'(0, \chi)$ yields a
unit in the real quadratic field cut out by $\chi$.


D. Formulae for L(E/K,1) when E is a Tate curve: Suppose that E has
a prime p of multiplicative reduction which is inert in K, and suppose
that all other primes dividing N are split in K. Then the sign in the
functional equation for $L(E/K, s)$ is 1 by prop. 3.2, and one expects no
Heegner point construction yielding a point on E(K). However, there
are Heegner points $P_n \in E(H_n)$, where $H_n$ is the ring class field of K of
conductor $p^n$, constructed from elliptic curves with level N structure having
complex multiplication by the orders of conductor $p^n$ in $\mathcal{O}_K$. The precise
construction is explained in [BD2], where it is shown that these points are
trace-compatible, and that $\mathrm{trace}_{H_1/H}(P_1) = 0$. Assume to simplify the
exposition that K has unit group $\mathcal{O}_K^\times = \pm 1$ and class number 1, so that
H = K, and that the group of connected components of the Néron model of
E/K at the prime p is trivial. The prime p is totally ramified in $H_n/K$; let
$\mathfrak{p}_n$ be the unique prime of $H_n$ over p, and let $\Phi_n$ be the group of connected
components of $E/H_n$ at the prime $\mathfrak{p}_n$; one has


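$$\Phi_n = \mathbb{Z}/(p+1)\mathbb{Z} \times \mathbb{Z}/p^{n-1}\mathbb{Z},$$
$$\Phi_\infty := \varprojlim \Phi_n = \mathbb{Z}/(p+1)\mathbb{Z} \times \mathbb{Z}_p,$$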


where the inverse limit is taken with respect to the norm maps. 

The main formula of [BD2] relates the image $\bar{P}_n$ of $P_n$ in the group $\Phi_n$
to the special value $L(E/K, 1)$. The norm-compatible system of points $P_n$
gives rise to a canonical Heegner element $P_\infty \in \varprojlim_n E(H_n)$, and hence
to an element $\bar{P}_\infty$ in $\Phi_\infty$. As a corollary to the main result of [BD2] one
obtains:


Theorem 3.4 The element $\bar{P}_\infty$ is non-torsion if and only if $L(E/K, 1) \ne 0$.


The calculations involved in the proof of theorem 3.4 are considerably sim- 
pler than those of [GZ] needed to prove theorem 3.3. The main ingredients 
in this proof are a formula of Gross for the special value L(E/K,1) (gen- 
eralized somewhat in [Dag]) and a moduli description due to Edixhoven 
for the specialization map to the group of connected components of $J_0(N)$.
For more details, see [BD2]. 

A precursor of theorem 3.4 for Eisenstein quotients can be found in 
Mazur’s article [Ma2]. 


E. p-adic analytic construction of Heegner points from derivatives of
p-adic L-functions: Assume for simplicity that E is semi-stable, and that, as
before, $E/\mathbb{Q}$ has a prime p of multiplicative reduction which is inert in K,
so that it is equipped with the analytic Tate parametrization


$$\Phi_{Tate} : K_p^\times \to E(K_p),$$


where $K_p := K \otimes \mathbb{Q}_p$. Assume now that $L(E/K, s)$ has sign $-1$ in its
functional equation. Let $H_\infty$ be the compositum of all the ring class fields
of K of conductor $p^n$, whose Galois group $G_\infty = \mathrm{Gal}(H_\infty/K)$ is canonically
isomorphic to an extension of the class group A = Gal(H/K) by the group
$(K_p^\times)_1$ of elements in $K_p$ of norm 1. By a generalization of the work of
Gross [Gr2] explained in [Dag], there exists an element $\mathcal{L}$ in the completed
integral group ring $\mathbb{Z}[[G_\infty]] := \varprojlim \mathbb{Z}[G_n]$ such that


roy? = £8 /K,x.1) / ff wot @ LH mB a8) 


£|N- 


for all finite order characters $\chi : G_\infty \to \mathbb{C}^\times$. The element $\mathcal{L}$ plays the role
of the p-adic L-function associated to the anti-cyclotomic $\mathbb{Z}_p$-extension in
this setting. (It really might be more accurate to view it as a square root of
the p-adic L-function.)

Note that if $\chi_{triv}$ denotes the trivial character, then $\chi_{triv}(\mathcal{L}) = 0$,
since L(E/K,1) = 0. Hence $\mathcal{L}$ belongs to the augmentation ideal I in the
completed group ring $\mathbb{Z}[[G_\infty]]$. Let $\mathcal{L}'$ be the natural projection of $\mathcal{L}$ in
$I/I^2 = G_\infty$. One shows (see [BD3]) that $\mathcal{L}'$ belongs to $(K_p^\times)_1 \subset G_\infty$. The
element $\mathcal{L}'$ in $K_p^\times$ should be viewed as the first derivative of the p-adic
L-function of E/K (in the anticyclotomic direction, at the trivial character).

Let $P_K$ be the Heegner point on E(K) coming from the Shimura curve
parametrization $\phi_{N^+,N^-}$ that was introduced in paragraph A of this
section, and let $\bar{P}_K$ be its Galois conjugate. The following theorem is the main
result of [BD3]:


Theorem 3.5 Let $w_p$ be a local sign which is $-1$ if $E/\mathbb{Q}_p$ has split
multiplicative reduction, and 1 if $E/\mathbb{Q}_p$ has non-split multiplicative reduction.
Then

$$\Phi_{Tate}(\mathcal{L}') = \pm(P_K + w_p \bar{P}_K).$$


Note that, since $p \mid N^-$, the curve $X_{N^+,N^-}$ is never a classical modular
curve. Like the formula of Rubin described in paragraph C, theorem 3.5
allows one to recover a global point in E(K) from the first derivative of a
p-adic L-function.

The main ingredients in the proof of theorem 3.5 are the explicit
construction of $\mathcal{L}$ given in [Gr2] and [Dag] and the Cerednik-Drinfeld theory
of p-adic uniformization of the Shimura curve $X_{N^+,N^-}$ [Cer], [Dz], [BC].
The details of the proof are given in [BD3].


Remarks: 

1. The formulas described in paragraphs D and E were inspired by some
fundamental ideas of Mazur, Tate, and Teitelbaum on p-adic analogues of 
the Birch and Swinnerton-Dyer conjecture. The connection with this circle 
of ideas is explained in [BD1]. 

2. There are many other generalizations of the Gross-Zagier formula which 
were not mentioned here because they are not directly relevant to modular
elliptic curves: for example, the work of Nekovar [Ne] and Zhang [Zh]
extending the work of Gross-Zagier and Kolyvagin to modular forms of 
higher weight, replacing Heegner points by higher-dimensional cycles on 
Kuga-Sato varieties. 

3. In connection with the results described in paragraphs C and E, one
should also mention an intriguing result of Ulmer [Ul], who constructs 
global points on certain universal elliptic curves over the function fields of 
modular curves in characteristic p. Some of the results described above 
(and, in particular, the formula of paragraph E) should extend to the func- 
tion field setting; this extension has some tantalizing similarities, as well 
as differences, with Ulmer’s constructions. 


4 The Birch and Swinnerton-Dyer Conjecture 


4.1 Analytic Rank 0 


For modular elliptic curves of analytic rank 0, one has the following theo- 
rem. 


Theorem 4.1 If $L(E/\mathbb{Q}, 1) \ne 0$, then $E(\mathbb{Q})$ is finite, and so is $\text{Ш}(E/\mathbb{Q})$.


There are now several ways of proving this theorem. We will review the 
different strategies, giving only the briefest indication of the details of the 
proofs. 


4.1.1 Kolyvagin’s proof 


Kolyvagin’s proof can be divided into three steps.


Step 1 (Non-vanishing lemma): Choose an auxiliary imaginary quadratic
field K/Q such that


1. All primes dividing N are split in K.


2. Under assumption 1, the sign $w_K$ is $-1$ and the L-function $L(E/K, s)$
necessarily vanishes at s = 1. In addition, one requires that the
L-function $L(E/K, s)$ has only a simple zero, that is, $L'(E^{(D)}/\mathbb{Q}, 1) \ne 0$.


The existence of such a quadratic field K follows from the theorems of
Bump-Friedberg-Hoffstein [BFH] and Murty-Murty [MM] on non-vanishing
of first derivatives of twists of automorphic L-series.


Step 2 (Gross-Zagier formula): Invoking the Gross-Zagier formula
(theorem 3.3), one concludes that the Heegner point $P_K \in E(K)$ is of infinite
order. In particular the rank of E(K) is at least 1.

Step 3 (Kolyvagin’s descent): In [Ko1], Kolyvagin proves the following
theorem:


Theorem 4.2 If the Heegner point $P_K$ is of infinite order, then E(K) has
rank 1 and $\text{Ш}(E/K)$ is finite.


Crucial to the proof of theorem 4.2 is the fact that the Heegner point P_K
does not come alone. Namely, for each abelian extension L/K such that the
Galois group Gal(L/Q) is dihedral, satisfying gcd(Disc(L/K), ND) = 1,
there is a Heegner point P_L in E(L), and this system of points is
norm-compatible in the sense that, if L_1 ⊂ L_2, then


trace_{L_2/L_1} P_{L_2} = ℓ(L_2/L_1) P_{L_1},


where ℓ(L_2/L_1) ∈ Z[Gal(L_1/K)] is an element whose definition involves
the local Euler factors in L(E/K,s) at the primes dividing Disc(L_2/L_1).
Kummer theory allows one to construct Galois cohomology classes c_L ∈
H^1(L,T_p(E)) from the points P_L, where T_p(E) is the p-adic Tate module
of E. These classes satisfy the same trace-compatibility properties as the
P_L. Kolyvagin calls such a system of cohomology classes an Euler System
[Ko2], and shows that if the “initial” class c_K is non-zero, the rank of E(K)
is less than or equal to 1 and Ш(E/K) is finite.

We will not go into the details of Kolyvagin’s ingenious argument,
referring the reader instead to [Ko1] and [Gr3].


4.1.2 A variant 


The following variant of Kolyvagin’s basic strategy avoids the non-vanishing 
result of Bump-Friedberg-Hoffstein and Murty-Murty, as well as the for- 
mula of Gross and Zagier. It only works, however, for elliptic curves having 
a prime p of multiplicative reduction, and does not prove the finiteness of
Ш(E/Q), but only of the p-primary part of Ш(E/Q).


Step 1 (Non-vanishing lemma): Choose now an auxiliary imaginary qua- 
dratic field K/Q such that 


1. The prime p is inert in K, and all the other primes dividing N are 
split in K. 


2. By proposition 3.2, the L-function L(E/K,s) has sign w_K = 1 in
its functional equation. One requires also that L(E^(D)/Q,1) ≠ 0, so
that L(E/K,1) ≠ 0.


The existence of such a quadratic field K follows from a theorem of Wald- 
spurger [Wal] on non-vanishing of the values of twists of automorphic L- 
series. 
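
The sign in condition 2 can be checked by a short computation. The sketch below is not taken from proposition 3.2. It assumes the standard formula for the sign of a quadratic twist, that the discriminant D of K is prime to N, and that p exactly divides N (as it must for multiplicative reduction). The symbols \varepsilon (the quadratic Dirichlet character attached to K) and w_E (the sign of L(E/Q,s)) are notation introduced here for illustration.

```latex
% Assumptions (ours, not the source's): gcd(D, N) = 1, p || N,
% \varepsilon = quadratic character of K, w_E = sign of L(E/Q, s).
\begin{align*}
L(E/K,s) &= L(E/\mathbf{Q},s)\, L(E^{(D)}/\mathbf{Q},s), \\
w_{E^{(D)}} &= \varepsilon(-N)\, w_E = -\varepsilon(N)\, w_E
  && (\varepsilon(-1) = -1 \text{ since } K \text{ is imaginary}), \\
w_K &= w_E \cdot w_{E^{(D)}} = -\varepsilon(N), \\
\varepsilon(N) &= \prod_{q^{e} \,\|\, N} \varepsilon(q)^{e} = \varepsilon(p) = -1
  && (p \text{ inert},\ \text{all other } q \mid N \text{ split}), \\
w_K &= +1 .
\end{align*}
```

Under the same assumptions, if every prime dividing N splits in K then \varepsilon(N) = 1 and the sign is -1, which is the situation of section 4.1.1.
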

Step 2 (A variant of the Gross-Zagier formula): Invoking theorem 3.4, one
finds that the element P_∞ has non-trivial image in Φ_p.


Step 3 (A variant of Kolyvagin’s descent): In [BD2], the following theorem 
is proved: 


Theorem 4.3 If the image P̄_∞ of P_∞ in Φ_p is non-torsion, then E(K)
has rank 0 and Ш(E/K) ⊗ Z_p is finite.


This theorem is proved by a minor adaptation of Kolyvagin’s argument.
The entire system of points P_n is now used to construct a cohomology class
c_K ∈ H^1(K,T_p(E)), which is part of an Euler system. The non-vanishing
of P̄_∞ translates into the non-triviality of the class c_K, and in fact of its
image in a certain quotient (the “singular part”) of the local cohomology
group H^1(K_p,T_p(E)). This non-triviality is used to uniformly bound the
p^n-Selmer group of E/K, following the ideas of Kolyvagin.
The details of the argument are explained in [BD2].


4.1.3 Kato’s proof


Recently Kato [Ka2] has discovered a wholly original proof of theorem 4.1
which does not require the choice of an auxiliary imaginary quadratic field 
and does not use Heegner points. 

Kato’s argument constructs cohomology classes c_L ∈ H^1(L,T_p(E)),
where L is a cyclotomic extension of the rationals with discriminant prime
to N. These classes are constructed from certain elements introduced
by Beilinson, belonging to the K_2 of modular function fields. Defined
via explicit modular units (Siegel units), these classes yield elements in
H^1(L,T_p(J_0(N))) which are mapped to H^1(L,T_p(E)) via the map φ of
theorem 2.6. (In particular, theorem 2.6 is also crucial to Kato’s
construction.)

Kato’s classes c_L obey norm-compatibility properties similar to those of
Kolyvagin, and hence deserve to be viewed as an Euler system [Ka1]. The
most difficult part of Kato’s argument, given in [Ka2], is to relate the basic
class c_Q ∈ H^1(Q,T_p(E)) (or rather, its localization in a certain quotient,
the “singular part”, of the local cohomology group H^1(Q_p,T_p(E)))
to the special value L(E/Q,1).


4.2 Analytic Rank 1 


In the case of analytic rank 1, there is: 


Theorem 4.4 Suppose that L(E/Q,1) = 0, but that L'(E/Q,1) ≠ 0.
Then E(Q) has rank 1, and Ш(E/Q) is finite.

This theorem lies somewhat deeper than theorem 4.1. Presently, the only
proof follows the basic strategy of Kolyvagin based on the Gross-Zagier 
formula. 


Step 1 (Non-vanishing lemma): Choose an auxiliary imaginary quadratic 
field K/Q such that: 


1. All primes dividing N are split in K.


2. The Hasse-Weil L-function L(E/K,s) has a simple zero at s = 1, so
that L(E^(D)/Q,1) ≠ 0.


The existence of such a quadratic field K follows from the same theorem 
of Waldspurger [Wal] on non-vanishing of values of twists of automorphic 
L-series used in step 1 of section 4.1.2. 


Step 2 (Gross-Zagier formula): Invoking the Gross-Zagier formula (theorem
3.3), one finds that the Heegner point P_K ∈ E(K) is of infinite order.
In particular, the rank of E(K) is at least 1. More precisely, by analyzing
the action of complex conjugation on P_K, one finds that P_K (up to torsion)
actually belongs to E(Q) in this case, so that the rank of E(Q) is at least 1.


Step 3 (Kolyvagin’s descent): By theorem 4.2, one concludes that E(K)
has rank 1 and finite Shafarevich-Tate group. Hence the rank of E(Q)
is exactly 1, its Shafarevich-Tate group Ш(E/Q) is finite, and, as a
by-product, E^(D)(Q) and Ш(E^(D)/Q) are also finite.
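
The bookkeeping behind step 3 can be sketched as follows. The factorization of L(E/K,s) and the decomposition of E(K) ⊗ Q into eigenspaces for complex conjugation are standard facts that the argument uses implicitly; the notation \operatorname{ord}_{s=1} is introduced here only for illustration.

```latex
% Sketch of the rank count in step 3 (notation ours).
\begin{align*}
\operatorname{ord}_{s=1} L(E/K,s)
  &= \operatorname{ord}_{s=1} L(E/\mathbf{Q},s)
   + \operatorname{ord}_{s=1} L(E^{(D)}/\mathbf{Q},s) = 1 + 0, \\
\operatorname{rank} E(K)
  &= \operatorname{rank} E(\mathbf{Q}) + \operatorname{rank} E^{(D)}(\mathbf{Q}), \\
1 = \operatorname{rank} E(K)
  &\ge \operatorname{rank} E(\mathbf{Q}) \ge 1
  \quad\Longrightarrow\quad
  \operatorname{rank} E(\mathbf{Q}) = 1, \qquad
  \operatorname{rank} E^{(D)}(\mathbf{Q}) = 0 .
\end{align*}
```

The finiteness statements follow in the same spirit: the kernel of the restriction map from Ш(E/Q) to Ш(E/K) lies in H^1(Gal(K/Q), E(K)), which is a finite group, and the same reasoning applies to E^(D), which becomes isomorphic to E over K.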


Remark: When the sign in the functional equation for L(E/Q,s) is -1, the
class c_Q constructed by Kato gives rise to a natural element in the pro-p
Selmer group of E/Q, defined as the inverse limit lim Sel(Q, E_{p^n}). One
might expect that this class is non-zero if and only if L'(E/Q,1) ≠ 0. A
proof of this would show that


L'(E/Q,1) ≠ 0 ⇒ rank(E(Q)) ≤ 1,


which represents a part of theorem 4.4. The reverse inequality seems harder 
to obtain with Kato’s methods. 


Index


A-augmentation, 258, 267 

A-B-C-Conjecture, 2, 529, 544 

A-modules, exact sequence of, 320 

A-representation, 314 

Abelian group, 101 

Abelian varieties, 530 

modular, 94-95
Abelianization, 248 
Absolute irreducibility, 318-320 
relaxing condition for, 324-325 

Absolute logarithmic height, 35 

Additive groups, 127 

Adelic representations, 168-170 

Admissible representations, 165 

Affine scheme, 381 

Albanese functoriality maps, 368 

Algebra, 121 

Algebra representations, group 
representations versus, 
251-252 

Algebraic integers, 513 

Analytic rank, 554 

Antipode, 126 

Archimedean case, 176-177

Archimedean representation theory, 
165-167 


Arithmetic theory of elliptic curves, 
17-40 
Arthur-Selberg, trace formula of, 
194-196 
Artin conductor, 212 
Artin L-function, 179 
Artin motives, 73, 96 
Artin symbol, 494 
Artinian object, 397 
Artin’s conjecture, 181 
base change and, 197-204 
Associativity, 122 
Asymptotic Fermat conjecture, 
529-530, 539 
Atkin-Lehner involution, 226 
Atkin-Lehner notation, 331 
Augmentation, 126 
Augmentation ideal, 126 
Augmentation ideal sheaf, 138 
Automorphic cuspidal 
representations, 164, 169 
of weight one, 171 
Automorphic induction, 186-188 
Automorphic representations 
defined, 169 
of weight one, 164-174 
Automorphism group of E, 19
Automorphisms
continuous, 5 
diamond, 68 
Frobenius, see Frobenius 
automorphism 
Auxiliary set of primes, 425 
Azumaya Algebra, 255 


Bad multiplicative reduction, 
elliptic curves with, 116 
Base change, 129 
Artin’s conjecture and, 197-204 
proof of, 196-197 
theory of, 190 
Base change lifting, 192-193 
Base change theory, 192-197 
Bernoulli numbers, 510 
generalized, 509 
Biquadratic reciprocity, 508 
Birch and Swinnerton-Dyer 
conjecture, 37, 554, 557, 562, 
563-566 
Bogomolov-Miyaoka-Yau inequality,
540 
Brauer group, 33, 102 
Brauer-Nesbitt Theorem, 391 
Bump-Friedberg-Hoffstein theorem, 
563 


Canonical height, 35-36

Canonical pairing, non-degenerate, 
118 

Carayol’s Lemma, 218, 219 

Carayol’s reductions, 237-239 

Cartan subgroup, 223 

Cartesian diagram, 268 

Cartier duality, 145 

Cartier map, 337 

Cartier-Nishi duality theorem, 388 

Casimir operator, 173 

Cassels’ global duality theorem, 33 

Category, group objects in, 122-125 

Central character, 167, 199 

Central function, 257 

Cerednik-Drinfeld theory, 562 

Change-of-basis matrix, 57 

Character-functions, characterizing, 
257 

Character group schemes, 130 


Characters, 129, 168 
fundamental, 149, 214, 376-377 
representations and, 252-254 
universal, 263 
Chebotarev density theorem, 93, 
249 

Circular units, 552 

Class field theory, first case of 
Fermat’s Last Theorem and, 
499-502 

Class number, 509 
relative, 509 

Classification theorem, 149-152,
443
CM (complex multiplication), 19, 
37 
Coboundaries, 101-102 
Coboundary conditions, 103 
Cocommutative Hopf algebras, 126 
Cocycle conditions, 103 
Cocycles, 101-102 
difference, 285 
method of obtaining, 109 
Coefficient-ring, 4, 249 
Coefficient-ring homomorphism, 
249 

Cohen-Macaulay ring, 329 

Cohomological interpretation of 
Zariski tangent A-modules, 
284-287 

Cokernels, 125 

Collection of local conditions, 
113-114 

Commutative algebra, 363-365 

Commutative p-group schemes, 
Raynaud’s results on, 
146-153 

Commutative triangle, 364 

Commutativity, 123 

Compactification of quotient, 555 

Complete intersection rings, 13 

Complete intersections, 343 
criteria for, 343-355 

Complete rings, local, 314 

Completed tensor product, 265 

Complex conjugation, 517 

Complex multiplication (CM), 19,
37

Comultiplication, 126 


Conductor, 3, 31-32, 168, 173 
nebentypus character of, 426 
Congruence groups, 467 
Congruence ideal, 343 
Congruence modules, 14, 366-370 
Congruence primes, 543 
Connected components, 139 
Connected-étale exact sequence 
over henselian local ring, 
138-142 
Connectedness, 391 
Constant on orbits morphisms, 135 
Constant S-schemes, 130 
Continuity proposition, 280 
Continuous automorphisms, 5 
Continuous functors, 267 
Continuous Kahler differentials, 
274-276 
Contragredient representation on 
dual space, 74 
Contravariant set functor, 121 
Converse theorem to Hecke theory, 
187-188 
Corank, 337 
Correspondence, 61 
Counit, 126 
Crossed homomorphism, 102 
Cubic reciprocity, 508 
Cup products, 106-107 
Cusp, 18, 45 
Cusp forms, 76 
Maass, 173 
of weight 2, 85-88 
Cuspidal automorphic 
representations, 169 
Cuspidal representations, 161 
Cyclotomic character, 5 
Cyclotomic fields, Kummer’s work 
on, 508-513 
Cyclotomic units, 509 


D-module, finite, 407 
Deck-transformation group, 434 
Decomposition subgroups, 5 
Deformation, 245 
Deformation conditions, 289-291 
Deformation functors
flat, 373-418
universal, 394
Deformation problem, reduced
tangent space of, 438 
Deformation rings, 375 
universal, 12 
Deformation theory 
of Galois representations, 
243-309 
local Galois cohomology and, 
397-406 
Deformation types, 11, 421 
Deformations, 4, 259, 313, 425-427 
flat, 324 
of galois representations, 
classifying, 108 
of group representations, 257-259 
infinitesimal, 109 
minimal, 423 
modular, see Modular 
deformations 
ordinary, 304, 323-324 
restrictions on, 323-324 
Degeneracy maps, 368 
Degree of representation, 250 
Deligne theorem, 211-215 
Descent, 35 
of group representations, 254-256
Determinant, fixed, 110 
global Galois deformation 
problem with, 294 
Determinant conditions, 291-292 
Deuring’s theorem, 38 
Dévissage, 147 
Diagonalizable group schemes, 128 
Diamond automorphisms, 68 
Diamond operators, 8, 77, 211 
Dieudonné modules, 337, 406-407 
Difference cocycles, 285 
Differential forms, 85-88, 234-237 
Dihedral case, 159 
Diophantine equations, 549 
conjectures about, 528-530 
Dirichlet character, 169, 552 
Dirichlet L-series, 550, 551
Discrete series representation, 177 
Discriminant, minimal, 3, 31 
Dual Hopf algebra, 144 
Dual isogeny, 19 
Dual Selmer group, 439 
Dual space, contragredient
representation on, 74 
Duality theory, 33 


Eichler-Shimura-Igusa theorem,
429
Eichler-Shimura theory, 7, 8
“Eisenstein” primes, 432
Eisenstein series, 24, 26
Elliptic curves, 2, 17 
arithmetic theory of, 17-40
with bad multiplicative 
reduction, 116 
conjectures about, 530-536 
elliptic functions and, 24-26
explicit families of, with galois 
representations mod n, 
449-461 
formal groups of, 26-27 
galois representations mod 5 and, 
471-473 
height conjecture for, 534 
isogenous, 23 
isomorphic, 25 
modular, see Modular elliptic 
curves 
over finite fields, 22-24 
over local fields, 27-29
relations between Fermat 
equations and, 536-539 
relations with, 527-553 
with same galois representations 
mod N, 450-454 
supersingular, 24 
Weierstrass parameterization of, 
52 
Wiles’ theorem and, 552-557 
zeta functions of, 23 
Elliptic functions, 24, 52-54 
elliptic curves and, 24-26 
Elliptic modular surfaces 
of level N, 450-449 
twists of, 449-450 
Elliptic regulator, 36 
Endomorphism ring of E, 19
Epsilon conjecture, 1, 9 
Euler characteristics, 108, 553
Euler product, 185
Euler System, 564 


Extensions, 108, 288 


Faltings’ construction, 263 
Faltings’ height, 530 
Faltings’ isogeny theorem, 32 
Faltings’ theorem, 34, 470, 529 
Fermat conjecture, asymptotic, 
529-530, 539 
Fermat equations, relations between 
elliptic curves and, 536-539 
Fermat-Pell equation, 551 
Fermat type, ternary equations of, 
527-553 
Fermat’s Last Theorem (FLT), 1, 
527, 546 
first case of, 499-500 
class field theory and, 499-502 
overview of proof of, 1-15 
for polynomials, 507 
proof of, 10 
for regular primes, 513-516 
remarks of history on, 505-523 
second case of, 506 
Shimura-Taniyama-Weil 
conjecture and, 220 
suggested readings on, 521-522 
Fiber products, 269 
representability and, 267 
Filtration, 28 
Finite D-module, 407 
Finite étale S-group schemes, 
136-138 
Finite fields, elliptic curves over, 
22-24 
Finite flat condition, 308 
Finite flat group schemes, 121-153
Fontaine’s approach to, 406-412 
passage to quotient by, 132-135 
techniques, 379 
Finite Honda systems, 410 
Finiteness at p, 214-215
Fitting ideals, 344, 345 
Fixed determinant, 110 
Flat deformation functors, 373-418 
Flat deformations, 324 
applications to, 413-418 
Flat group schemes, finite, 121-153 
Flat representations, 116, 375-393, 
398 


Flatness, 383 
FLT, see Fermat’s Last Theorem 
Fontaine-Laffaille modules, 413 
Fontaine-Laffaille theory, 116 
Fontaine’s approach to finite flat 
group schemes, 406-412 
Forms of level N, 83 
Formal group law, 27 
Formal groups of elliptic curves, 
26-27 
Freeness of Hecke algebra, 434-436
Frey, Gerhart, 1 
Frobenius automorphism, 6 
Hecke correspondences and, 
69-73 
Frobenius conjugacy class, 157 
Frobenius element, 38, 71, 94, 247 
Frobenius morphism, 22 
Frobenius-semilinearity, 407 
Functoriality, 193 
established examples of, 186-190
Functoriality maps, Albanese, 368 
Functors 
continuous, 267 
defining, 394-397 
flat deformation, 373-418 
representability and, 267-284 
smooth morphisms of, 278 
universal deformation, 394 
Fundamental characters, 149, 214, 
376-377 
Fundamental groups, 137 
Fundamental lemma, 197 


G-module, 101 
Galois cohomology, 101-120 
local, deformation theory and, 
397-406 
Galois deformation conditions, 
294-296
Galois deformation problem, global, 
294 
Galois-equivariant pairing, 333 
Galois extension, 42 
Galois groups, 246 
local, 5 
Galois-invariant vector, 441 
Galois representations, 2, 3-7, 212, 
246-251 
arising naturally, 250-251
associated to newforms, 7-8
attached to E, 20-21
classification of, by j-invariant of 
E, 493-498 
classifying deformations of, 108 
deformation theory of, 243-309 
determinants of, 4 
global, 5-6 
local, 476-479 
local behavior of, 475 
local properties of, 5-6
mod 3, 231-233 
mod 4, 456-457 
mod 5 
elliptic curves and, 471-473 
modularity of, 463-473 
mod n, explicit families of elliptic 
curves with, 447-461 
mod p, 157-158, 220 
modular, 2, 7-9 
modularity of, 8 
in number theory, 6 
ordinary, 11 
p-adic, 6-7
remarkable, 7 
residual representations of, 4 
semistable, 11 
unramified, 6 
Gauss’s theorem, 551 
General linear groups, 127 
General power reciprocity law, 501 
Generalized Bernoulli numbers, 509 
Generalized Selmer groups, 
111-113, 365 
Geometric height, 530 
Geometric versions of Wiles’ 
theorem, 555-557 
Global duality theorem, 33 
Global field, 527 
Global Galois deformation problem, 
294 
with fixed determinant, 294 
Global galois representations, 5-6 
Global L-series, 32 
Global minimal Weierstrass 
equation, 31 
Good reduction, 29 
Gorenstein condition, 349 
Gorenstein property, 12, 327-330
importance of, for Hecke algebra, 
327-341 
Goursat’s Lemma, 253 
Gross-Zagier formula, 558-559, 563,
566
variants of, 559-563, 565 
Grossencharacters, 169 
Grothendieck’s semistable 
reduction theorem, 74 
Grothendieck’s theorem, 135, 277 
Group, 101 
Group functors, 124
Group law, 18 
Group-like elements, 130 
Group objects, 123 
in category, 122-125 
Group representations, see also 
Representations 
algebra representations versus, 
251-252 
deformations of, 257-259 
descent of, 254-256 
Group schemes, 121, 125-132, 381 
diagonalizable, 128 
Gunderson’s theorem, 500 
Haar measure, 81
Hall Conjecture, 545 
Hasse invariant, 209 
Hasse-Weil conjecture, 73-75 
Hasse-Weil L-function, 32, 550, 
551, 566 
analytic rank 0, 557 
analytic rank 1, 558-559 
special values of, 557-563 
Wiles’ theorem and, 553-555 
Hasse-Weil zeta-function, 550 
Hasse’s theorem, 23 
Hecke algebras, 8, 89-94, 330-331, 
359-361, 482-483 
freeness of, 434-436 
importance of Gorenstein 
property for, 327-341 
Hecke congruence subgroup, 168 
Hecke corollary, 553 
Hecke correspondences, 61-73 
Frobenius automorphism and, 
69-73 


moduli interpretation of, 63-66 
on upper half-plane, 66-67
Hecke eigenform, 80 
Hecke-Jacquet-Langlands 
L-function, 179 
Hecke L-series, 38 
Hecke operators, 7, 78-81, 211, 429 
Hecke rings, 375, 427-430 
universal deformation rings and, 
421-444 
Hecke theory, Converse theorem to, 
187-188 
Heegner points, 558-559 
p-adic analytic construction of, 
561-562 
Height conjecture for elliptic 
curves, 534 
Heights, 35, 530 
Henselian local ring, connected- 
étale exact sequence over, 
138-142 
Hensel’s lemma, 26 
Herbrand quotient, 523 
Herbrand’s theorem, 518-519 
Hermite-Minkowski theorem, 117 
Hilbert class field, 38 
Hilbert space, 165 
Hilbert symbol, 102 
Hilbert’s theorem 90, 104, 523 
Hilbert’s theorem 94, 523 
Hochschild-Serre spectral sequence, 
105 
Holomorphic form, 173 
Hom, 103-104 
Homomorphisms 
coefficient-ring, 249 
crossed, 102 
k-algebra, 141 
lifting, to matrix groups, 317-318 
Homothetic lattices, 25 
Honda systems, 438 
finite, 410 
Hopf algebras, 121, 125-126 
dual, 144 
Hurwitz’ genus formula, 42, 235, 
I-ordinary cohomology group, 305


I-ordinary cohomology submodule, 
306 
I-ordinary representation, 304-305 
Icosahedral case, 159 
Ideal numbers, 505 
Ideals, 508 
Idele class group, 169 
Index of H in G, 135 
Inertia group, 5, 101, 376 
Infinitesimal deformations, 109 
Inflation-restriction proposition, 
105 
inflation-restriction sequence, 117 
Integral points, 39-40 
Intersection rings, complete, 13 
Intersections, complete, see 
Complete intersections 
Invariant subspace, 166 
Inverses, 123 
Irreducibility, 161, 257 
absolute, see Absolute 
irreducibility 
supersingular case and, 377 
Irreducibility theorem, 464 
proof of, 470 
Irregular primes, 509 
Isogenies, 19 
kernels of, 121 
Isogenous elliptic curves, 23 
Isogeny theorem, 423 
Isomorphic elliptic curves, 25 
Isomorphism classes, 46 
Iwasawa decomposition, 166 
Iwasawa theory, 519 


J-function, 50 
j-invariants, 17 
classification of galois 
representations by, 491-498 
modular, 454-455 
j-minimal curves, 493 
Jacobian variety, 71 
Jacobi’s formula, 26 
Jacquet-Langlands correspondence, 
221, 487 
Jacquet-Langlands theorem, 556 
Jordan-Holder factors, 391-392 


K-algebra homomorphisms, 141 
K-finite vectors, 165
k-representation, 313 
k-vector space, 351 
Kahler differentials, 210, 236 
continuous, 274-276 
Kamienny, Mazur, Merel theorem, 
34 
Kani’s conjecture, 542 
Kato’s proof, 565 
Katz’s definition of modular forms, 
209 
Kernel of Norm, 104 
Kernels, 124-125
of isogenies, 121 
Klein 4-group, 198 
Kodaira-Spencer isomorphism, 217, 
234 
Kodaira symbols, 492 
Kolyvagin’s descent, 564, 566 
variant of, 565 
Kolyvagin’s proof, 563-564 
Koszul complexes, 346 
Kronecker-Weber theorem, 113, 552 
Krull-Schmidt-Akizuki theorem, 
338 
Krull topology, 4 
Kuga-Sato varieties, 563 
Kummer congruence, 522 
Kummer generators, 105 
Kummer sequence, 29 
Kummer’s lemma, 512 
Kummer’s work on cyclotomic 
fields, 508-513 


L-functions, 73-99, 192 
L-groups, 183 
L-morphism, 185 
L-series, 558
Rankin convolution of, 558 
Langlands class, 171, 174, 183, 184 
Langlands functoriality conjecture, 
175-176 
statement of, 185-186 
theory and results, 182-191 
Langlands L-factors, 184 
Langlands parameter, 175, 182 
Langlands program, 175-191
Shimura-Taniyama-Weil 
Conjecture and, 190-191 
Langlands reciprocity conjecture
(LRC), 164, 179-182 
Langlands theory, 14 
Langlands-Tunnell form, 230-239 
Langlands-Tunnell theorem, 
158-159, 230-231 
proof of, 192-204 
reformulation of, 179-180 
Lang’s conjecture, 533-534 
Laplace-Beltrami operator, 173 
Lattices, homothetic, 25 
Law of composition, 122 
Left actions, 136 
Leopoldt Conjecture, 262 
Lie algebras, 406 
Lifting homomorphisms to matrix 
groups, 317-318 
Liftings, 4, 216 
strictly equivalent, 258 
LLC (local Langlands conjecture), 
175 
Local behavior of galois 
representations, 475 
Local complete rings, 314 
Local conditions, collection of, 
113-114 
Local duality theorem, 33 
Local fields, elliptic curves over, 
27-29 
Local Galois cohomology, 
deformation theory and, 
397-406 
Local galois groups, 5 
Local Galois representations, 
476-479 
Local Langlands conjecture (LLC), 
175 
Local Langlands correspondence, 
175, 176-179 
Local Tate duality, 107-108 
Local terms, computation of, 
439-442 
LRC (Langlands reciprocity 
conjecture), 164, 179-182 


Maass cusp form, 173 
Magmas, 122 

Manin constant, 555 
Manin-Drinfeld theorem, 557 


Masser-Oesterle A-B-C conjecture, 
2, 529, 544 
Matrix algebra, 556 
Matrix groups, lifting 
homomorphisms to, 317-318 
Mayer-Vietoris property, 270, 277 
Mazur’s corollary, 547 
Mazur’s modular lifting conjecture, 
158 
Mazur’s result, 225-227
Mazur’s theorem, 34 
Merel’s theorem, 532, 539 
Meromorphic function on 
punctured disk, 49 
Minimal case, 14 
Minimal deformations, 423 
Minimal discriminant, 3, 31 
Minimal ramification conditions, 
300 
Minimal Weierstrass equation, 27 
Minimality condition, 422 
Minimally ramified liftings, 480-481 
“Mittag-Leffler” argument, 431 
Modified Tate cohomology group, 
101 
Modular abelian varieties, 94-95
Modular curves, 41-60, 466-470
classical theory of, 468 
of level N, 450-449 
as quotients of upper half-plane, 
58-60 
twists of, 449-450, 469 
Modular deformation ring, 
universal, 12 
Modular deformations, Wiles’s 
“main conjecture” and, 
357-370 
Modular elliptic curves, 95, 543 
explicit families of, 454-461 
Modular forms, 2, 76-78 
Katz’s definition of, 209 
lifting, 216 
Modular functions, 49-51 
Modular galois representations, 7-9
Modular j-invariants, 454-455 
Modular lifting conjecture, Mazur’s, 
158 
Modularity, 155-204


of galois representations mod 5, 
463-473 
Modularity Conjecture, 1, 9-10 
semistable, 11 
Modularity theorem, 464 
proof of, 470-471 
Moduli interpretation of Hecke 
correspondences, 63-66 
Moduli scheme parametrizing 
triples, 210 
Mordell Conjecture, 470 
Mordell-Weil group, 553 
Mordell-Weil theorem, 29 
proof of, 36 
Multiplication maps, 18 
Multiplicative groups, 127 
Murty-Murty theorem, 563 


N-division point representation, 
250 
Nagell-Lutz Theorem, 460 
Nakayama’s Lemma, 138, 308 
Nearly representable element, 277 
Nebentypus character of conductor, 
426 
Néron differential, 554 
Néron model, 380 
Néron-Ogg-Shafarevich criterion, 
28 
Néron property, 225, 226 
Néron-Tate canonical height, 35-36, 
554 
Newforms, 7, 173, 428 
galois representations associated 
to, 7-8 
of level N, 83 
Node, 18 
Noetherian rings, 121, 313 
Non-minimal case, 14 
Non-split reduction, 27 
Non-vanishing lemma, 564 
Norm, 101 
kernel of, 104 
Normalized basis, 55-57 
Number theory, galois 
representations in, 6 


Octahedral case, 159, 202—204 
Old forms of level N, 83 
Ordinary deformations, 304,
323-324 

Ordinary galois representations, 11 

Ordinary representation, 304-305 


P-adic analogue, Perrin-Riou’s, 560 
p-adic analytic construction of 
Heegner points, 561-562 
p-adic case, 177-179 
p-adic galois representations, 6-7
p-adic L-function, square root of, 
562 
p-adic numbers, 510 
p-adic representation theory, 
167-168 
p-class group, structure of, 517-521 
p-finiteness condition, 246 
Pell’s equation, 552 
Perrin-Riou’s p-adic analogue, 560 
Petersson inner product, 81-83 
Picard functor, 225 
Poitou-Tate proposition, 119 
Polynomials, Fermat’s Last 
Theorem for, 507 
Potential good reduction, 29 
Primes 
auxiliary set of, 425 
congruence, 543 
“Eisenstein,” 434
irregular, 509 
regular, see Regular primes 
Primitive forms, 7 
of level N, 83 
Pro-finite flat condition, 308 
Pro-representable hull, 278 
Profinite group, 313 
Projective limits, 320-323 
Prolongations, 146 
unicity of, 153 
Pseudo-characters, 257 
Pseudo-representation, 257 
Punctured disk, meromorphic 
function on, 49 
Pythagorean triples, 549 


Quantum groups, 126 
Quasi-period of lattice L, 26 
Quaternionic surfaces, 560 
Quotient 
compactification of, 555 
Herbrand, 523 
passage to, by finite flat group
schemes, 132-135
unramified, 179 
of upper half-plane, 57-58 
modular curves as, 58-60 
Quotient ring, 89 


Ramakrishna’s theorem, 396-397, 
403-406 
proof of, 374 
Ramakrishna’s theory, 292-294 
Ramification conditions, minimal, 
300 
Rank, analytic, 556 
Rankin convolution of L-series, 558 
Rankin-Selberg L-function, 201 
Rankin-Selberg method, 97 
Rankin-Selberg products, 189-190 
Rational torsion, 34 
Raynaud F-module scheme, 148 
Raynaud’s results on commutative 
p-group schemes, 146-153 
Reduced tangent space of 
deformation problem, 438 
Reduction of E, 27
Regular primes, 509 
Fermat’s Last Theorem for, 
513-516 
Relative class number, 509 
Relative Zariski tangent A-modules, 
286 
Relative Zariski tangent space, 
274-276 
Relatively representable 
subfunctors, 278-279 
Representability 
fiber products and, 267 
functors and, 267-284 
strong near, 282 
weak near, 281 
Representations, see also Group 
representations 
adelic, 168-170 
admissible, 165 
characters and, 252-254 
residual, 4, 258, 259 


Residual representations, 4, 258, 
259 

Restrictions on deformations, 
323-324 

Ribet, Ken, 1 

Ribet’s theorem, 9, 227-230 

Riemann-Roch formula, 217 

Riemann zeta function, 26 

Right invariance, 132 

Ring, 121 

Rosati involution, 333 

Rubin’s p-adic formula, 560 


S-group schemes, 125 
Satake isomorphism, 194, 196 
Scheme of left cosets of H in G, 135 
Schemes, 121 
Schlessinger’s Criteria, 262-263 
Schlessinger’s representability 
theorem, 276-278 
Schur-type theorems, 254-256 
Schur’s Lemma, 252 
Selmer groups, 13, 30, 365-366, 
436-439 
dual, 439 
generalized, 111-113 
Semisimple ring, 91 
Semistability, 3, 376 
Semistable galois representations, 
11 
Semistable Modularity Conjecture, 
11 
Semistable reduction, 455-456 
Separable closure, 101 
Serre, Jean-Pierre, 1 
Serre duality, 217 
Serre’s conjectures, 8-9, 209-239 
cases for, 222-224 
statement and results, 209-222 
Serre’s theorem, 34 
Shafarevich-Tate group, 30, 554 
Shapiro’s lemma, 434 
Shimura curves, 221, 556 
analogues, 561-562 
Shimura-Taniyama conjecture, 
97-99 
Shimura-Taniyama-Weil conjecture 
Fermat’s Last Theorem and, 220 
Langlands program and, 190-191 


Siegel’s theorems, 39 

Singular cubics, 18 

Smooth morphisms of functors, 278 

Snake Lemma, 110 

Space of new forms of level N, 83 

Space of old forms of level N, 83 

Split reduction, 27 

Square root of p-adic L-function, 
562 

Stickelberger’s theorem, 506, 518 

Strictly equivalent liftings, 258 

Strictly free actions, 134 

Strong Artin conjecture, 164, 179 

Strong finiteness, 286 

Strong multiplicity one, 174 

Strong near representability, 282 

Subfunctors, relatively 
representable, 278-279 

Supersingular case, 373 

irreducibility and, 377 

Supersingular elliptic curves, 24 

Supersingular points, 233 

Sylow subgroup, 148 

Symmetric square lifting, 188-189 

Szpiro’s conjecture, 3, 534 


Tame ramification group, 213 

Tame ramification theory, 492 

Taniyama-Weil conjecture, see
Shimura-Taniyama-Weil 
conjecture 

Tannakian approach, 248 

Tate cohomology group, modified, 
101 

Tate curves, 34-35, 210 

Tate duality, local, 33, 107-108 

Tate module, 6, 20, 357 

Tate period, 423 

Tate-Poitou exact sequence, 439 

Tate’s local duality theorem, 33, 
107-108 

Taylor, Richard, 1 

Taylor-Wiles-Faltings criterion, 
430-432, 485 

Teichmüller lift, 216

Ternary equations of Fermat type, 
527-553 

Tetrahedral case, 159, 198-202 

Topological generator, 104 
Topological ring, 4
Torsion points, 20 
Torsion subgroups, 457-460 
Trace formula, 192
of Arthur-Selberg, 194-196
Trace of Frobenius, 23 
Translation-by-Q map, 18 
Triple of integers, 2 
Twisted regular representation, 196 
Twists
of elliptic modular surfaces,
449-450
of modular curves, 449-450, 469


Unicity of prolongations, 153 
Unique factorization, failure of, 508 
Unit elements, 122 
Universal characters, 263 
Universal coefficient-ring, 261 
Universal deformation, 12, 259, 
426-427 
Universal deformation functors, 394 
Universal deformation rings, 12, 
259, 313, 362, 426, 481-482 
explicit construction of, 313-325 
Hecke rings and, 421-444 
structure of, 436-442 
Universal deformation space, 259 
Universal modular deformation, 12 
Universal modular deformation 
ring, 12 
Unramified galois representations, 6 
Unramified induced representation, 
178 
Unramified quotient, 179 
Unramifiedness criterion, 390 
Upper half-plane 
Hecke correspondences on, 66-67
quotient of, see Quotient of 
upper half-plane 


Vandiver’s conjecture, 506, 516, 
520 


Waldspurger’s theorem, 566 
Weak finiteness, 286 

Weak near representability, 281 
Weber function on E, 37
Wedderburn’s Theorem, 385 
Weierstrass ℘-function, 24
Weierstrass σ-function, 25
Weierstrass equation, 2, 17 
global minimal, 31 
minimal, 27 
Weierstrass model, 450 
Weierstrass parameterization of 
elliptic curves, 52 
Weil-Chatelet group, 30 
Weil form, 183 
Weil group, 175, 176 
Weil pairing, 21-22, 157 
Wild ramification group, 214 
Wiles, Andrew, 2 
Wiles-Lenstra criterion, 488 
Wiles’ “main conjecture,” modular 
deformations and, 357-370 
Wiles’ numerical criterion, 13 
Wiles’ results, extension of, 475-488 
elliptic curves and, 552-557
geometric versions of, 555-557
Hasse-Weil L-function and,
Zariski-Nagata theorem, 389

Zariski tangent A-modules, 273 
cohomological interpretation of, 